Humans like to imagine we're unpredictable beings, to a certain extent, governed by free will arising somehow from physical processes. Well, here's one weird thing to send you into a linguistics-based existential crisis: most languages appear to follow a rule known as Zipf's law, and we have no idea why.
Words are used with varying frequency, as you might expect. You have more use for the word "the" than you do for the words "cosmopolitan" or "phubbing", for example. But analyzing the frequency of word use in large texts reveals that it closely follows a specific statistical law.
"About 80 years ago, George Kingsley Zipf reported an observation that the frequency of a word seems to be a power law function of its frequency rank, formulated as f(r) ∝ r^α, where f is word frequency, r is the rank of frequency, and α is the exponent," a paper on the topic explains.

Zipf's law applies to the first 10 million words in 30 different languages on Wikipedia. Image credit: SergioJimenez/Wikimedia Commons (CC BY-SA 4.0)
To put it simply, the most frequently used word in a language (in English, "the") is used about twice as often as the second most common word, three times as often as the third, four times as often as the fourth, and so on, following this power law for a surprisingly long time.
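To make that arithmetic concrete, here is a minimal Python sketch of what the pattern predicts. It assumes the idealized case where the exponent α is exactly -1, and the top-word count of 60,000 is a made-up figure for illustration, not a measurement from any real corpus. The helper name `zipf_expected_counts` is likewise hypothetical.

```python
def zipf_expected_counts(top_count, n_ranks, alpha=-1.0):
    """Predicted counts for ranks 1..n_ranks under f(r) proportional to r**alpha.

    alpha=-1.0 is the idealized Zipf case (an assumption); real texts
    only follow it approximately.
    """
    return [top_count * rank ** alpha for rank in range(1, n_ranks + 1)]


# Hypothetical corpus in which the top word appears 60,000 times.
counts = zipf_expected_counts(top_count=60000, n_ranks=5)
for rank, count in enumerate(counts, start=1):
    print(f"rank {rank}: about {round(count)} occurrences")
```

Under these assumptions the second-ranked word gets half the top word's count, the third gets a third of it, and so on.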
You may think this is some weird quirk of English, but it isn't. Zipf's law appears to hold for almost all languages that have been looked into. No matter whether you are speaking English, Hindi, French, Mandarin, or Spanish, the frequency of a word appears to drop off in proportion to its popularity ranking.
Weirder still, it even applies to languages we haven't decoded yet. Even the words appearing in the mysterious Voynich Manuscript appear to follow this law. And single texts, if they are large enough, will roughly follow the law too, with the top-ranked word appearing about twice as often as the next, and so on. Even Charles Darwin can't evolve his way out of this one, with one analysis finding it applies fairly neatly to his text On the Origin of Species. In fact, it crops up all over the place.
So, that's pretty weird, no?
"It is worth reflecting on the peculiarity of this law," a review of the topic explains. "It is certainly a nontrivial property of human language that words vary in frequency at all; it might have been reasonable to expect that all words should be about equally frequent. But given that words do vary in frequency, it is unclear why words should follow such a precise mathematical rule, in particular one that does not reference any aspect of each word's meaning."
There are many potential explanations for the idea, from statistical issues to constraints imposed by human memory and vocabulary. George Zipf himself proposed that the law comes from a balance of effort minimization, with speakers (or writers) attempting to minimize their own effort by using more frequently occurring words, and listeners (or readers) seeking clarity in language from less frequently used words. An extension of this is that humans attempt to convey meaning as efficiently as possible, tending towards words that maximize the amount of information they can convey.
Another idea is that more common words tend to become more popular over time as language spreads and develops, leading to a sort of snowball effect. But none are really accepted as the explanation, and the cause behind it remains a bit of a mystery.
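The snowball idea is easy to play with in a few lines of Python. The sketch below is one simple rich-get-richer process, in the spirit of what is often called Simon's model. It is offered as an illustration of how such an explanation might work, not as the accepted mechanism. The function name `simulate_snowball`, the 10 percent chance of coining a new word, and the text length are all assumptions chosen for the example.

```python
import random
from collections import Counter


def simulate_snowball(n_tokens, new_word_prob=0.1, seed=0):
    """Grow a toy 'text' in which already-common words get reused more often.

    At each step, with probability new_word_prob a brand-new word is coined;
    otherwise a word is copied from a random earlier position, so a word's
    chance of being reused is proportional to how often it has appeared.
    Returns the word counts sorted from most to least frequent.
    """
    if n_tokens < 1:
        raise ValueError("n_tokens must be at least 1")
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    tokens = [0]               # start with a single word, labelled 0
    next_word_id = 1
    while len(tokens) < n_tokens:
        if rng.random() < new_word_prob:
            tokens.append(next_word_id)
            next_word_id += 1
        else:
            tokens.append(rng.choice(tokens))
    return [count for _, count in Counter(tokens).most_common()]


counts = simulate_snowball(n_tokens=20000)
print("distinct words:", len(counts))
print("top five counts:", counts[:5])
```

In runs like this a handful of early words end up dominating while most words appear only once or twice, which is the heavy-tailed shape the snowball explanation is meant to account for. Whether real language evolves this way is exactly what remains unsettled.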
If you would really like to send yourself into a linguistics-based existential crisis, you could even paste your own (long) text, novel, or paper into a distribution calculator and see if it obeys Zipf's law. You might not like how predictable your use of language may seem, but fear not: even Shakespeare's Hamlet appears to follow it too.
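If you would rather run the check yourself, a rough version fits in a short Python script. This is only a sketch under stated assumptions: the tokenizer is deliberately crude (lowercase runs of letters and apostrophes), the function names `rank_frequency` and `fit_exponent` are hypothetical, and fitting a straight line to log(frequency) against log(rank) gives a quick estimate of the exponent rather than a rigorous statistical test.

```python
import math
import re
from collections import Counter


def rank_frequency(text):
    """Return word counts sorted from most to least frequent."""
    # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return [count for _, count in Counter(words).most_common()]


def fit_exponent(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    A slope near -1 is what the idealized form of Zipf's law predicts.
    """
    if len(frequencies) < 2:
        raise ValueError("need at least two distinct words to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Tiny placeholder text; swap in a long text of your own for a meaningful result.
sample = "the cat and the hat and the bat"
frequencies = rank_frequency(sample)
print("counts by rank:", frequencies)
print("estimated exponent:", fit_exponent(frequencies))
```

A sentence-length sample like the one above is far too short to say anything. The pattern only becomes visible in long texts, which is why the article suggests a novel or a paper.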