Zipf law plot (frequency as a function of frequency rank) for various texts. The languages, texts and word frequency files are:

[[Hebrew language|Hebrew]]. The first five books (''[[Torah]]'', ''Pentateuch'') of the Hebrew Bible (''Tanak''). Taken from the 10th-century version (the [[Masoretic text]]) of the original, which was probably composed mainly around 500 BCE from earlier texts. Source: the ''Sacred Texts'' site, maintained by John B. Hare. The text is in an ad-hoc single-byte encoding designed to look vaguely phonetic under an ISO-Latin-1 font, '''with''' vowel points but '''without''' cantillation marks.
* Book 1, ''Bereshit'' (''Genesis''). Sample: ''<nowiki>bĪ°rëĄsđïyą bĪâr⥠Ą°ęlöhïym Ąëą häsĪđâmäyïm w°Ąëą hâĄâręþ w°hâĄâręþ</nowiki>'' [...] ''<nowiki>wäyĪäįän°twĪ Ąöąwö wäyĪïysēęm bĪâĄârwön bĪ°mïþ°râyïm</nowiki>''. File hebr/tav/gen.1/gud.wfr (17211 words, ''N'' = 7212 distinct).
* Book 2, ''Shmot'' (''Exodus''). Sample: ''<nowiki>w°ĄëlĪęh sđ°mwöą bĪ°nëy yïsē°râĄël häbĪâĄïym mïþ°rây°mâh Ąëą yäŋ°äqöb</nowiki>'' [...] ''<nowiki>bĪwöl°ŋëynëy kâlbĪëyąyïsē°râĄël bĪ°kâlmäs°ŋëyhęm</nowiki>''. File hebr/tav/exo.1/gud.wfr (13870 words, ''N'' = 5711 distinct).
* Book 4, ''Bamidbar'' (''Numbers''). Sample: ''<nowiki>wäy°däbĪër y°hwâh Ąęlmösđęh bĪ°mïd°bĪär sïynäy bĪ°Ąöhęl mwöŋëd bĪ°Ąęįâd</nowiki>'' [...] ''<nowiki>bĪ°ŋär°böą mwöĄâb ŋäl yär°dĪën y°rëįwö</nowiki>''. File hebr/tav/num.1/gud.wfr (13573 words, ''N'' = 5306 distinct).
* Book 3, ''Vayikra'' (''Leviticus''). Sample: ''<nowiki>wäyĪïq°râ Ąęlmösđęh wäy°däbĪër y°hwâh Ąëlâyw mëĄöhęl mwöŋëd lëĄmör</nowiki>'' [...] ''<nowiki>ĄęąmösđęhĄęlbĪ°nëy yïsē°râĄël bĪ°här sïynây</nowiki>''. File hebr/tav/lev.1/gud.wfr (9650 words, ''N'' = 3860 distinct).
* Book 5, ''Devarim'' (''Deuteronomy''). Sample: ''<nowiki>ĄëlĪęh hädĪ°bârïym Ą°äsđęr dĪïbĪęr mösđęh ĄęlkĪâlyïsē°râĄël bĪ°ŋëbęr</nowiki>'' [...] ''<nowiki>hämĪwör⥠hägĪâdwöl Ą°äsđęr ŋâsēâh mösđęh l°ŋëynëy kĪâlyïsē°râĄël</nowiki>''. File hebr/tav/deu.1/gud.wfr (12007 words, ''N'' = 5455 distinct).

The word frequency files '*/*/*/gud.wfr' are available at the [https://www.ic.unicamp.br/~stolfi/EXPORT/projects/voynich/Notes/tr-stats/dat/ UNICAMP website]. The original annotated full texts are in the companion files */*/org/main.src. The extracted texts -- one word per line, without punctuation -- are in */*/*/gud.tlw.
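
As a rough illustration of how rank and frequency pairs like those plotted here could be derived from a one-word-per-line file such as gud.tlw, the following Python sketch counts the words and ranks them by decreasing frequency. It is not the script used to produce the plot or the gud.wfr files. The function names <code>read_tlw</code> and <code>rank_frequency</code> are hypothetical, and the Latin-1 decoding is an assumption based on the single-byte encoding described above.

```python
from collections import Counter


def read_tlw(path, encoding="latin-1"):
    """Read a one-word-per-line text file into a list of words.

    The latin-1 default is an assumption: the source only says the
    texts use an ad-hoc single-byte encoding.
    """
    with open(path, encoding=encoding) as f:
        return [line.strip() for line in f]


def rank_frequency(words):
    """Return (rank, frequency, word) tuples, most frequent word first.

    Ties in frequency are broken alphabetically so the result is
    deterministic; blank entries are ignored.
    """
    counts = Counter(w for w in words if w)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, freq, word)
            for rank, (word, freq) in enumerate(ordered, start=1)]


if __name__ == "__main__":
    # Tiny in-memory sample instead of a real gud.tlw file.
    sample = ["w", "a", "w", "b", "a", "w"]
    table = rank_frequency(sample)
    total_words = sum(freq for _, freq, _ in table)
    distinct_words = len(table)
    print(total_words, "words,", distinct_words, "distinct")
    for rank, freq, word in table:
        print(rank, freq, word)
```

Under these assumptions, the total of the frequencies corresponds to the word count quoted for each book and the number of rows to ''N'', the number of distinct words. Plotting frequency against rank on log-log axes then gives a Zipf plot of the kind shown.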