Zipf's law plot (word frequency as a function of frequency rank) for various texts.
The languages, texts and the word frequency files are:
Synthetic languages imitating Voynichese, the language of the ''[[Voynich Manuscript]]''. Text generated manually by Gordon Rugg with his proposed 'table-and-grille' method.
* Whole text. Sample: ''olkshedy otedy qocheol ochecthdy aiin qochekdy rchey qol ol okdy'' [...] ''okeey yky olchedy ky cheol kd oshey ol''. File voyp/grm/tot.1/gud.wfr (708 words, ''N'' = 307 distinct).
Voynichese, the language of the ''[[Voynich Manuscript]]''. Prose-like parts from Majority Vote version of the text, excluding 'labels'. Extracted from the Landini/Zandbergen Interlinear Transcription 1.6e6.
* 'Herbal' section, language A, part 1 (pages f1v-f11v,f13r-f25v,f27r-f30v,f32r,f32v,f35r-f38v,f42r,f42v,f44r-f45v,f47r,f47v,f51r-f54v,f56r,f56v). Sample: ''kchsy chadaiin ol oltchey char cfhar am yteeay char or ochy dcho lkody'' [...] ''sokchol chol chol daiin''. File voyn/prs/hea.1/gud.wfr (original 6726 words, truncated/filtered to 6703 words, ''N'' = 1980 distinct).
[[Greek language|Greek]]. Text ''[[Byzantine text-type]]'' or ''Majority Text'' version of the ''[[New Testament]]'' in vulgar Byzantine Greek (''koinē''), from 300 CE or earlier, in an ad hoc encoding of the Greek alphabet into ISO Latin-1.
* Whole text (27 books). Sample: ''biblos geneseōs iėsou qristou uiou dauid uiou abraam abraam egennėsen'' [...] ''maršas tės''. File grek/nwt/tot.1/gud.wfr (original 66183 words, truncated/filtered to 35027 words, ''N'' = 5436 distinct).
The word frequency files '*/*/*/gud.wfr' are available at the [https://www.ic.unicamp.br/~stolfi/EXPORT/projects/voynich/Notes/tr-stats/dat/ UNICAMP website]. The original annotated full texts, before truncation/filtering, are in the companion files */*/org/main.src. The truncated/filtered texts (one word per line, without punctuation) are in */*/*/gud.tlw.
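
The rank/frequency pairs behind a plot of this kind can be recomputed from one of the one-word-per-line files. The following is a minimal sketch, not the script used to produce the plot. The function names <code>read_tlw</code> and <code>rank_frequency</code> are hypothetical, the Latin-1 file encoding is an assumption based on the encoding mentioned for the Greek text, and the alphabetical tie-breaking among equally frequent words is an arbitrary choice made here for reproducibility.

```python
from collections import Counter


def read_tlw(path, encoding="latin-1"):
    """Read a one-word-per-line text file and return the list of words.

    The encoding default is an assumption; blank lines are skipped.
    """
    with open(path, encoding=encoding) as f:
        return [line.strip() for line in f if line.strip()]


def rank_frequency(words):
    """Return (rank, word, count, relative frequency) tuples.

    Rank 1 is the most frequent word. Ties are broken alphabetically,
    which is an arbitrary choice for this sketch.
    """
    counts = Counter(words)
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [
        (rank, word, n, n / total)
        for rank, (word, n) in enumerate(ordered, start=1)
    ]


if __name__ == "__main__":
    # Tiny in-memory demo using the last four words of the 'Herbal' sample.
    demo = ["sokchol", "chol", "chol", "daiin"]
    for rank, word, n, freq in rank_frequency(demo):
        print(rank, word, n, f"{freq:.3f}")
```

For a real file, <code>rank_frequency(read_tlw(path))</code> yields the points to plot. The number of tuples returned corresponds to the distinct-word count ''N'' quoted for each text, and plotting relative frequency against rank on logarithmic axes gives the usual Zipf-style view.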