Textual analysis of “Capital in the Twenty-First Century”

The 16 words used 500+ times

Here are the most frequent words in “Capital in the Twenty-First Century”*1.
A chart of most frequent words in "Capital in the Twenty-First Century"

Word Times
capital 2366
income 2156
percent 1935
wealth 1508
national 908
countries 883
tax 870
century 786
rate 741
growth 728
inequality 666
france 624
top 579
world 549
states 542
united 500
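
A table like this one can be pulled from the word-count file var/wc.csv with a simple threshold filter. The following is only a sketch: it assumes each row of the file has the form word,count with no header line, and the function name frequent_words is a hypothetical helper, not something from my original scripts.

```shell
# Hypothetical helper: print "word count" for every word used at least
# MIN times, most frequent first.
# Assumption: the input file holds rows of the form word,count (no header).
frequent_words() {
  min="$1"
  file="$2"
  # "+ 0" forces a numeric comparison in awk.
  awk -F',' -v min="$min" '$2 + 0 >= min + 0 {print $1, $2}' "$file" |
    sort -k2,2nr   # sort on the count column, numerically, descending
}

# Usage (assuming the file exists):
#   frequent_words 500 var/wc.csv
```

Passing the threshold as an argument makes it easy to try other cut-offs, such as 1000 or 250.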

Count of countries and regions

Word count by countries and regions
Ref. wc_countries

Count of years, decades and centuries

Word Times
2010 384
nineteenth 317
twentieth 262
2012 141
eighteenth 137
1910 136
1950 132
1970 128
1980 112
1914 109
1945 104
1970s 88
1980s 86
2013 82
1990 81
1913 75
2000 75
1900 69
1820 58
1870 52

The rows above were filtered from var/wc.csv with:
awk -F',' '$1 ~ /1[89][0-9][0-9]|20[0-9][0-9]|nineteenth|twentieth|eighteenth/ {print $1","$2}' var/wc.csv
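
The awk filter above prints matching rows in file order and matches the pattern anywhere in the word. A stricter variant, sketched here under the same assumption that the file holds word,count rows, anchors the pattern to the whole word and sorts by count. The function name year_words and the default limit of 20 (the number of rows in the table) are my own choices for illustration.

```shell
# Hypothetical helper: list years, decades and century words by frequency.
# Assumption: the input file holds rows of the form word,count (no header).
#   $1 = CSV file, $2 = maximum number of rows to print (default 20)
year_words() {
  # Matches 1800-2099, an optional trailing "s" for decades (e.g. 1970s),
  # and the three century words. ^...$ requires the whole word to match.
  awk -F',' '$1 ~ /^((1[89]|20)[0-9][0-9]s?|eighteenth|nineteenth|twentieth)$/ {print $1, $2}' "$1" |
    sort -k2,2nr |
    head -n "${2:-20}"
}

# Usage (assuming the file exists):
#   year_words var/wc.csv
```

Anchoring matters because the unanchored pattern would also accept any longer token that merely contains a four-digit year.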

General aggregation

Number
Total words 258837
Unique words 10468
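
Totals of this kind can be derived from a word,count file by summing the count column and counting the rows. This is a sketch only: the function name summarize_counts is hypothetical, and whether the figures above include stop words depends on how the count file was built, so the output for var/wc.csv may not match them exactly.

```shell
# Hypothetical helper: print total and unique word counts.
# Assumption: the input file holds one row per distinct word, as word,count.
summarize_counts() {
  awk -F',' '
    NF >= 2 { total += $2; unique++ }   # skip blank or malformed rows
    END {
      print "Total words", total + 0
      print "Unique words", unique + 0
    }
  ' "$1"
}

# Usage (assuming the file exists):
#   summarize_counts var/wc.csv
```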

Sources

Footnote

  1. Strictly speaking, the list excludes stop words. For this analysis, I used the MySQL 5.5 Full-Text Stopwords list to filter them out before counting.
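
The stop-word filtering described in the footnote can be sketched as a small pipeline. Everything here is an assumption made for illustration: the function name count_words, a plain-text copy of the book, and a stop-word file with one lowercase word per line. The tokenizer is deliberately crude (it splits on every non-alphanumeric character, so "don't" becomes "don" and "t") and is not necessarily the one used for the tables above.

```shell
# Hypothetical helper: count words in a text file, skipping stop words.
#   $1 = plain-text input, $2 = stop-word list (one lowercase word per line)
# Output: word,count rows, sorted alphabetically by word.
count_words() {
  tr '[:upper:]' '[:lower:]' < "$1" |   # normalize case
    tr -cs '[:alnum:]' '\n' |           # one token per line
    sed '/^$/d' |                       # drop empty lines
    grep -v -x -F -f "$2" |             # remove exact stop-word matches
    sort | uniq -c |                    # count identical tokens
    awk '{print $2 "," $1}'             # reorder to word,count
}

# Usage (file names are placeholders):
#   count_words book.txt stopwords.txt > var/wc.csv
```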

Author: @i05

A Tokyo-based software developer, a grad student in a security lab at JAIST, and a husband to my Mrs.