Zipf's law plot (word frequency as a function of frequency rank) for various texts.

The languages, texts and the word frequency files are:

[[English language|English]]. Text of the ''[[Wakefield Mystery Plays]]'' aka ''Towneley Mystery Plays'', popular plays on religious themes (1460). The language is [[Middle English]] with archaic characters replaced.

* Whole text (minus Latin insertions). Sample: ''<nowiki>i am the first the last also oone god in mageste meruelus of myght most</nowiki>'' [...] ''<nowiki>and with in this chyld may</nowiki>''. File engl/twp/tot.1/gud.wfr (original 81498 words, truncated/filtered to 35027 words, ''N'' = 4202 distinct).

Text from [[Nicholas Culpeper]]'s herbal medicine handbook ''The English Physitian'' (1652), excluding numerals, Latin insertions, marginal notes, verses, titles, etc.

* Whole text. Sample: ''<nowiki>courteous reader aristotle in his metaphysicks writing of the nature of</nowiki>'' [...] ''<nowiki>so will a paper also if it do but touch the water your best way then</nowiki>''. File engl/cul/tot.1/gud.wfr (original 122229 words, truncated/filtered to 35027 words, ''N'' = 3544 distinct).

Text of [[H. G. Wells]]'s novel ''[[The War of the Worlds]]'' (1898), excluding numbers, mapped to lowercase.

* Whole text. Sample: ''<nowiki>no one would have believed in the last years of the nineteenth century</nowiki>'' [...] ''<nowiki>there were already a couple of score of passengers aboard some of</nowiki>''. File engl/wow/tot.1/gud.wfr (original 60293 words, truncated/filtered to 35027 words, ''N'' = 4869 distinct).

The word frequency files <code>*/*/*/gud.wfr</code> are available at the [https://www.ic.unicamp.br/~stolfi/EXPORT/projects/voynich/Notes/tr-stats/dat/ UNICAMP website]. The original annotated full texts, before truncation/filtering, are in the companion files <code>*/*/org/main.src</code>. The truncated/filtered texts (one word per line, without punctuation) are in <code>*/*/*/gud.tlw</code>.
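
The rank/frequency data behind such a plot can be recomputed from a one-word-per-line text. The following is a minimal sketch, not the script used to produce the files above: the function names <code>rank_frequency</code> and <code>read_words</code> are hypothetical, the UTF-8 encoding is an assumption, and the layout of the <code>gud.wfr</code> files is not assumed here. The example runs on the opening words of the ''War of the Worlds'' sample quoted above.

```python
from collections import Counter


def rank_frequency(words):
    """Return (rank, frequency, word) triples, most frequent word first."""
    counts = Counter(w for w in words if w)  # skip empty strings
    # Sort by descending count; ties are broken alphabetically so the
    # result is deterministic (the tie-breaking rule is an assumption).
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, freq, word)
            for rank, (word, freq) in enumerate(ordered, start=1)]


def read_words(path):
    """Read a one-word-per-line file (encoding assumed to be UTF-8)."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f]


# Opening words of the sample quoted in the file description.
sample = ("no one would have believed in the last years "
          "of the nineteenth century").split()
table = rank_frequency(sample)

for rank, freq, word in table[:3]:
    print(rank, freq, word)
```

Plotting the frequency against the rank on log-log axes then gives the kind of curve shown in the image; for a full text, <code>read_words</code> could be pointed at a local copy of one of the <code>gud.tlw</code> files.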