Zipf's law plot (word frequency as a function of frequency rank) for various texts.

The languages, texts, and word frequency files are:

[[Russian language|Russian]]. Text of the novel ''[[Roadside Picnic]]'' (''Piknik na obochine'') by [[Arkady Strugatsky|Arkady]] and [[Boris Strugatsky]]. Transliterated from Cyrillic to Latin letters, e.g. 'ю' ⟶ 'yu', 'щ' ⟶ 'shch', with numerals excluded.

* Whole text. Sample: ''<nowiki>nakanune stoim eto my s nim v hranilishche uzhe vecherom ostaetsya</nowiki>'' [...] ''<nowiki>prihozhej poslyshalis' sharkayushchie shagi postukivanie i</nowiki>''. File russ/pic/tot.1/gud.wfr (original 45915 words, truncated/filtered to 35027 words, ''N'' = 9761 distinct).

The first five books (the ''Pentateuch'') from the [[Synodal Russian Bible]] (1876). Translated from Old Slavonic, with many archaic words. Romanized, all lowercase.

* All five books. Sample: ''<nowiki>v nachale sotvoril bog nebo i zemlyu zemlya zhe byla bezvidna i pusta i</nowiki>'' [...] ''<nowiki>v den' sobraniya i otdal ikh gospod' mne i</nowiki>''. File russ/ptr/tot.1/gud.wfr (original 111824 words, truncated/filtered to 35027 words, ''N'' = 5520 distinct).

The word frequency files '*/*/*/gud.wfr' are available at the [https://www.ic.unicamp.br/~stolfi/EXPORT/projects/voynich/Notes/tr-stats/dat/ UNICAMP website]. The original annotated full texts, before truncation/filtering, are in the companion files '*/*/org/main.src'. The truncated/filtered texts (one word per line, without punctuation) are in '*/*/*/gud.tlw'.
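
As an illustration only, the rank/frequency pairs shown in such a plot could be derived from a one-word-per-line text roughly as follows. This is a minimal sketch, not the script that produced the files above: the function names <code>rank_frequency</code> and <code>read_words</code> are hypothetical, the UTF-8 encoding is an assumption, and the layout of the '.wfr' files themselves is not assumed here.

```python
from collections import Counter


def rank_frequency(words):
    """Return (rank, word, count) tuples, most frequent word first.

    Hypothetical helper for illustration; ties in count are broken
    alphabetically so that the ranking is deterministic.
    """
    counts = Counter(words)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ordered, start=1)]


def read_words(path):
    """Read a one-word-per-line file, skipping blank lines.

    Assumes UTF-8 text; the real encoding of the files is not stated.
    """
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


# Small in-memory example using the opening words of the Pentateuch sample.
sample = ("v nachale sotvoril bog nebo i zemlyu zemlya zhe "
          "byla bezvidna i pusta i").split()
table = rank_frequency(sample)
```

Plotting the count against the rank on doubly logarithmic axes then gives a curve of the kind shown in the figure. For a real file one would pass <code>read_words(path)</code> to <code>rank_frequency</code> instead of the in-memory sample.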