Zipf's law plot (word frequency as a function of frequency rank) for various texts.

The languages, texts and the word frequency files are:

Synthetic languages imitating Voynichese, the language of the ''[[Voynich Manuscript]]''. Text generated by Gordon Rugg with a software implementation of his proposed 'table-and-grille' method.

* Whole text. Sample: ''<nowiki>ky sheey keeaiin qoty cheol sheaiin sheedy qokeedy rkey otey qokdy yty</nowiki>'' [...] ''<nowiki>keey yshedy okar shey kdy okeoldy</nowiki>''. File voyp/grs/tot.1/gud.wfr (1950 words, ''N'' = 635 distinct).

Voynichese, the language of the ''[[Voynich Manuscript]]''. Prose-like parts from Majority Vote version of the text, excluding 'labels'. Extracted from the Landini/Zandbergen Interlinear Transcription 1.6e6.

* 'Herbal' section, language A, part 1 (pages f1v-f11v,f13r-f25v,f27r-f30v,f32r,f32v,f35r-f38v,f42r,f42v,f44r-f45v,f47r,f47v,f51r-f54v,f56r,f56v). Sample: ''<nowiki>kchsy chadaiin ol oltchey char cfhar am yteeay char or ochy dcho lkody</nowiki>'' [...] ''<nowiki>sokchol chol chol daiin</nowiki>''. File voyn/prs/hea.1/gud.wfr (original 6726 words, truncated/filtered to 6703 words, ''N'' = 1980 distinct).

* 'Herbal' section, language B, part 1 (pages f26r,f26v,f31r,f31v,f33r-f34v,f39r-f41v,f43r,f43v,f46r,f46v,f48r,f48v,f50r,f50v,f55r,f55v,f57r,f66v). Sample: ''<nowiki>psheoky odaiir qoy ofseod chypchey ypchedy ain chofo chcphdy dchey</nowiki>'' [...] ''<nowiki>daiin dal kal ykedaiin shedar okchdy oty</nowiki>''. File voyn/prs/heb.1/gud.wfr (original 2827 words, truncated/filtered to 2820 words, ''N'' = 1111 distinct).

The word frequency files '*/*/*/gud.wfr' are available at the [https://www.ic.unicamp.br/~stolfi/EXPORT/projects/voynich/Notes/tr-stats/dat/ UNICAMP website]. The original annotated full texts, before truncation/filtering, are in the companion files '*/*/org/main.src'. The truncated/filtered texts (one word per line, without punctuation) are in '*/*/*/gud.tlw'.
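
The rank/frequency data behind a plot of this kind can be rebuilt from a one-word-per-line text such as the 'gud.tlw' files described above. The following is a minimal sketch, not the script used to produce this plot: the function name <code>rank_frequency</code> is hypothetical, and the only assumption about the input is the one stated above (one word per line, no punctuation). The layout of the 'gud.wfr' files is not described here, so the sketch does not read them.

```python
from collections import Counter


def rank_frequency(lines):
    """Return (rank, count, word) tuples, most frequent word first.

    `lines` is any iterable of strings holding one word each, which is
    the layout assumed for the truncated/filtered text files.
    """
    # Strip newlines and skip blank lines.
    words = [ln.strip() for ln in lines if ln.strip()]
    counts = Counter(words)
    # Sort by decreasing count; break ties alphabetically so that the
    # ranking is deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, count, word)
            for rank, (word, count) in enumerate(ordered, start=1)]


# Demo on the last four words of the 'Herbal' language A sample quoted above.
sample = ["sokchol", "chol", "chol", "daiin"]
table = rank_frequency(sample)
for rank, count, word in table:
    print(rank, count, word)
```

For a local copy of one of the text files, the same function can be called on the open file object, for example <code>rank_frequency(open(path))</code>, where <code>path</code> is a placeholder for wherever the file was saved. The length of the returned list is the number of distinct words (''N''), and plotting count against rank on log-log axes gives the Zipf plot.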