Zipf's law plot (word frequency as a function of frequency rank) for various texts.
The languages, texts and the word frequency files are:
Voynichese, the language of the ''[[Voynich Manuscript]]''. Prose-like parts from the Majority Vote version of the text, excluding 'labels'. Extracted from the Landini/Zandbergen Interlinear Transcription 1.6e6.
* Pages f65r and f65v, unknown text type. Sample: ''otaim dam alam cphy fchecfhy dy dchepain shety qopy fol chpdy daiin'' [...] ''okey dy qo ytchey ykchedy chedy shckhdy''. File voyn/prs/unk.3/gud.wfr (44 words, ''N'' = 43 distinct).
The word frequency files '*/*/*/gud.wfr' are available at the [https://www.ic.unicamp.br/~stolfi/EXPORT/projects/voynich/Notes/tr-stats/dat/ UNICAMP website]. The original annotated full texts are in the companion files '*/*/org/main.src'. The extracted texts (one word per line, without punctuation) are in '*/*/*/gud.tlw'.
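
As an illustration only, the rank and frequency pairs shown in a plot of this kind can be derived from a one-word-per-line text as sketched below. This is a minimal sketch, not the script used to produce the plot: the function names <code>read_tlw</code> and <code>rank_frequency</code> are hypothetical, the tie-breaking rule (alphabetical) is an assumption, and the layout of the 'gud.wfr' files is not described here, so the sketch starts from the word list instead. The demonstration uses only the two sample fragments quoted above for pages f65r and f65v, not the full 44-word text.

```python
from collections import Counter


def read_tlw(path):
    """Read a one-word-per-line text file (hypothetical helper).

    Assumes the layout described above: one word per line, no punctuation.
    Blank lines are skipped.
    """
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def rank_frequency(words):
    """Return (rank, frequency, word) triples, most frequent word first.

    Ties in frequency are broken alphabetically (an assumption made only
    so that the output is deterministic).
    """
    counts = Counter(words)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, freq, word)
            for rank, (word, freq) in enumerate(ordered, start=1)]


# Demonstration on the two quoted sample fragments only (the elided
# middle part of the text is NOT included).
sample = (
    "otaim dam alam cphy fchecfhy dy dchepain shety qopy fol chpdy daiin "
    "okey dy qo ytchey ykchedy chedy shckhdy"
).split()

table = rank_frequency(sample)
for rank, freq, word in table[:3]:
    print(rank, freq, word)
```

Plotting the frequency column against the rank column on log-log axes gives the usual Zipf diagram; for a real file one would call <code>rank_frequency(read_tlw(path))</code> with a path to one of the 'gud.tlw' files.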