Wrote a small script to generate a KWIC index for a given word. Here is the result:

    cat .fix.wds \
      | kwic-index -v key=oHae

    cccHcc8a ??? 8arok // zoeHc8a oHae 8ar oHa oHar oHar oe z // zor zccHa qoHam oHae 8ae oeccc8a 8am ??? ??? ccc8a qoe zccoe oeccc8a // oHae oPae oHcca eoe ??? Hc8a // oHc8a qcHa ??? oea oHae zccc8a qoHcc8a qoHcca ea // zae 8oe zc8a oeHc8a qoHam oHae ??? // = qHor zcc8a ccc8or ccae zcc8 qoHcc8a ??? oHae ccc8ae // 8ae or ccc8a ccca qoHcc8a oeHc8a ??? oHcc8a oHae // zaeHcca zor zcccHca 8am oecca // eccc8a zcc8ae qoHc8a oHae ccc8a qoHae zcc8a ccc8a qoe // 8zcc8a 8cc8a qoHcc8a oHc8a oHae Hc8a oHca ??? // qoHa // zoe Hcc8a zcccHca ??? oHae zcc8a oHar oe ccc8a cHc8a 8am aHc8a aoe zccca qoHcca oHae oe Hcc8a 8aHa // 8zcc8a oe cczca oe ccccHa 8ar oHae 8ae oeccc8a // cHa ??? // ??? zcca oe zcc8a oHae oHccca oHcoe oeccc8a qoe // zcoe zam zcccHa or ??? oHae oeHae qoHar ??? // zor or zccH oHar Pcc8a ??? oHae zcc8a // ??? ezc8a ??? Hzcca 8zc8a oHccar zccH ??? oHae ora // qoHcca qoHccca ??? // zcam zccHca qoea Hzcc8a oHae zccca qoHam zcca ??? // qoHam ccca qoHam oHam oHam oHae oe 8ak // ??? ccoe zcc8a qoHcc8a qoHam o8a ??? oHae // zoe zcccoe oe cccca oeoHccc8a qoHam qoHam ??? qoHam oHae qoHam oHc8a ??? // ??? qoHae 8a cccHca eccca ??? oHae // qoHc8a qoHcc8a ccccHca oeccc8a oe // zoeccca qoHcca eHar oHae oeHcca oHam zcccHca oe // qoHc8a 8aea // Hoe Ham oHae ccc8a qoHar oe zcc8a ccccHca roea // Hzcc8a qoHc8a oeHam oHae cccHca qoHa 8am ??? // = Hzcc8a qoHam zcc8a qoHaz oHae qoPzcc8a qoHa eccc8a ??? // eccc8a ??? Hoe ccca qoHcc8a oHae 8am oe 8ae // cccoe oe oHam // Hor Ham oHae a rccca qoHae oeor am a rccca qoHae oeor am oHae ??? // ??? 8ccc8a qoHae ??? zcc8a qoe ??? oHaw oHae ccc8a 8a // Pccc8a qoHca zccHa // qoHc8a oHam Haw oHae zar oe ??? oeHam ae aHc8a 8ar ??? qoHa aHc8a oHae // 8zcc8a aHcc8a ??? 8am ??? zcc8a qoe zcc8am zcccHca oHae zccHa qoHam ccc8oe // 8zcc8a cccca ram zccz qoHccca qoHam oHae 8a ???
    // oram zcc8a ccc8a oHa cc8a 8a ccccHca oHae ccez // Hoe zccoe qoHc8a ccca qzam qoHcca qoHam oHam oHae oHcc8a qoHae 8ak // zam ??? // ??? qocHcoe zcccHca oHae qoHcae zcccHc8a cHae oHc8a qoHc8a 8aezc8a // qoHam ccc8a qoHae oHae ccc8a qoHaez // oazcca qoe qoHam ccoe cccHa ??? ccca oHae ccc8a Hora oHzcc8a qoHca ezcc8a // Poe Har zcc8a qoHc8a oHae zcca qoHar cccHca oHccca qoHccc8a ??? ??? qoHcc8a 8am ccccHca oHae oe Hc8a ccc8a 8oeHc8a oHc8ak // ??? ??? ??? ??? oHae ccc8a ccc8a // ??? oe qoHca qoHcoea // ccae 8am oHae cc8a oHae cccHcor aea // // ccae 8am oHae cc8a oHae cccHcor aea // qoHc8a zcca

It seems that "oHae" is often preceded or followed by "ccc8a" or "zcc8a".

    cat .fix.wds \
      | kwic-index -v key=roe

    oHae8a 8ar oHar oHc8a 8a roe // Hccc8a Pccc8a qoHcca ??? PaHc8a oePccc8a qoHc8a zPcca ccc8a roe ??? oPccc8a qoHc8a // oezcc8a ??? ??? qoHcc8a qoHar oHcc8a roe ??? // ??? oeHcca oe ear // eoe ccca ??? roe 8am oHa qoHa oHaeor zcccHca ror cccca zcccHca qoHam ccc8a roe // acccoe Ham zcca qoHam aHam oeHam zcc8a qoHa 8ccc8a roe oe cHc8a // aHca oHccc8a ccccHa ??? 8ae ??? Pcc8a roe qoHc8a roe // 8ae zcoe 8ae ??? Pcc8a roe qoHc8a roe // 8ae zcoe 8ar oe cccHc8a qoHoe8a // 8oeccc8a eccca roe roe cccca zam ??? // qoHoe8a // 8oeccc8a eccca roe roe cccca zam ??? // =

    cat .fix.wds \
      | kwic-index -v key=zar

    qoe zcc8a qoe oHam ccar zar oea // qoHzcc8a qoe zcca qa oe Hcca 8am zam zar zcc8a qoHcor oHcc8a // qoHcc8a ??? oHca 8ae aHae ccc8a zar // z ccor zcc8a qoHc8 zcc8a ??? HzccoHcc8a ozccPoez // zar oeHcca zcoHam zccoeoe oHc8a qcHcc8a zaw zcccHca eHcc8a eccc8a // zar zccc8a qoHcc8a qoeHca ecc8a ??? // oeccca qoHca qoHam oecccc8a zar ??? // zam ??? qoHcc8a 8am ccc8a qoe Hcc8a qoHccc8a zar ??? ccccHa 8a // qoHam qoHam oHaecca Hae or ccccHca zar // qoHa cccaqa Ham ccca oe or ??? // ??? zar oe ??? oHcca zcor qoHcca // qoHc8a oHam Haw oHae zar oe ???
    oeHam ae oe qoHae zc oe zcaeza // zar oe eoram ccca qoHam o ccc8a eHc8a qoHccc8a qoHc8a cccHc8a zar // zccc8aw oHccc8a qoHcc8a ccc8am

    cat .fix.wds \
      | kwic-index -v key=oeHam

    ??? 8c8a Hc8a // qoqoHcca oeHam qoe zccc8a qoHcor zccc8a qoHae // ??? qoe ccc8a oqoHam oeHam cccHca qoHae 8ar // qoHae qoHc8a oHc8a 8ae or oHcc8 oeHam // qoHoe oHc8 oHam ccc8 oe // qoHcc8a qoHca 8am oeHam 8ae ccc8a oeoe 8ae ccccHa Ham oe // qor oeHcca oeHam oe cczca oe ccccHa 8ar ??? zccca oe ccc8a oeHar oeHam // oeHam zcca qoHca oHar oe ccc8a oeHar oeHam // oeHam zcca qoHca oHar oe oHcc8a // qoHcc8a zccca Haz cccca oeHam oe ora ccoe ??? oHa zcca qoHc8a qoHc8a qoHc8a 8ar oeHam ccak // 8ccc8a eccca qcHa qoHcca oe oezc8a qoHam oHcc8a oeHam oHzcca zam oe // aHcc8a cccHca oHoe Hcoe zccoe qoHae oeHam cccHca // qoHcc8a qoe ??? Hcoe qoHa qoHae zcc8a zae oeHam ??? qoHe // azcca e ??? // ??? zcar oHam oeHam oe oeHam oram oeor ccccHca ??? zcar oHam oeHam oe oeHam oram oeor ccccHca oeor // ??? eoeok // qoHam ??? oeHam zcc8a qoHam oroe z // Pccc8a roea // Hzcc8a qoHc8a oeHam oHae cccHca qoHa 8am ??? Ham zcca qoHam ccc8a qoHoe oeHam zccHccae // eor ar oe Ham oeHar zcca qoHam ??? oeHam oHam zcae qoHa // oecccoe oHcc8a qoHca ea // qoHam oeHam oeHam oe Hccoe // Poe qoHca ea // qoHam oeHam oeHam oe Hccoe // Poe or oe Hccoe // Poe or oeHam ocHca qoHam oHam oHar zcca qoeHa // qoe oe ??? oeHam oe ccc8a qoHam oezcca ??? qoeHam zcc8a // qoHcc8a qoHcr oeHam oezcc8a qoHca qoHcca ??? ??? zccar zcca oezcca cHcor aecc8a oeHam zccHca qoea // Hoe ??? azccc8a zca ccca ecccHca aHar oeHam oe // e zcc8a qoHc8a Haw oHae zar oe ??? oeHam ae oe ??? 8e // Har // ??? ??? aHam oeHam zcc8a qoHa 8ccc8a roe oe ccca qoHccca qoHa oHam oHam oeHam oHam oHar aea // qoHam qoHam zccae qoe ccc8a qoHaiw oeHam zcc8a // = Hccae oeHaw qo8a zcar ??? qoHccca Hccca oeHam oPccc8a qPoe zcHa orae // ccoe ??? cccHca 8ae cccHca oeHam oeHccar // Pcccoe zcc8a qoHam

    cat .fix.wds \
      | kwic-index -v key=oeHar

    ???
    qca Ham zcccHa eHam oeHar or // 8acHca eHako aHcca oe ??? zccca oe ccc8a oeHar oeHam // oeHam zcca qoHca zcHcoe qoHar zccHa cccHam ??? oeHar oHam zccHa qoHae ??? // ??? // = Pzcoe Ham oeHar zcca qoHam ??? oeHam oHam zccHca qoea // Hoe ??? oeHar a qoe qoe Ham ??? // e zcc8a qoHc8a zor oeHar oe Ham oe Hzc8a // qocca ee8ar cccca qoae qoHcc8a oeHar zccc8a qoHam oeae // qoe

Let's look at the ladies-in-tubes labels collected by Jim Reeds on page f77v:

    Locus      Currier      FSG          HOP          Comments
    ---------  -----------  -----------  -----------  -----------------------------
    152 N1 1   OFAN/AFOE    ODAN ADOE    oHam aHoe    ; ladies with hands in tubing
    152 N2 1   OPOE/ZC89    OHOE SC89    oHoe zcc8a   ; under N1
    152 N3 1   OEFS8OE      OEDT8OE      oeHcc8oe     ; center top
    152 N4 1   OPOEOR       OHOEOR       oHoeor       ;
    152 N5 1   ORSC8AE      ORTC8AE      orccc8ae     ; under N4
    152 W1 1   2ORORAE      2ORORAE      zororae      ; above lady's head
    152 W2 1   OECOC8N      OECOC8N      oecoc8m      ; on her vascular boat's hull
    152 E1 1   OFA          ODA          oHa          ;

"oHam aHoe": The word "oHam" is very common:

    75v 76v 78v 79v
    105  30.4  14.6  oHam      .1.121223223.3.12.1..21.133422653535141114..3..43.33.12.111

The word "aHoe" doesn't occur at all, but it may be "oHoe", which does; see below.

"oHoe zcc8a": The word "oHoe" occurs 11 times: 4 times followed by "ccc8a", once preceded by "zcc8a", once followed by "8ae zcc8a". The word "zcc8a" is very common.

    75v 76v 78v 79v
     11  33.9  16.2  oHoe      ......1......1..........1.1...111.....1.............1....11
    233  31.0  18.3  zcc8a     3362314964524.854555723254313.11335332325362432445686576476

    qoHcc8a ??? oeHoe ocHcca zcc8a oHoe ??? e8a // oHczcca oeHc8a ccc8a qoHam ccc8a Ham cccHcc8a oHoe oHa // zam oHam zccHcc8a // ??? qoe zcc8a oeHc8a oHoe ccc8a qoHam 8ae // cccca // ??? qoHzcc8a qoHam ??? oHoe zcccoe ??? ccoHa oHcccaz // 8aHam oHc8a 8Hcca Har oe oHoe ??? // ??? ??? Har qor cccca Ham cce oe oHoe 8am oHam oe oHcc8a qoHcm zcc8a qoHca ??? qoHar cccHca oHoe Hcoe zccoe qoHae oeHam cccHca ???
    oe ar zccHca eHoe oHoe ccc8a qoHam cccHca qoHae // // a zcca oHccc8 ar oHoe 8ae zcc8a qoHaea // ??? oe zcc8a qoHc8a // qoHcca oHoe ccc8a ??? // 8zccca qoHcc8a // zor ??? 8oe Hc8a oHoe ccc8a // =

"oeHcc8oe": The word "oeHcc8oe" does not occur at all. However, "oeHcc" occurs once, right after the picture, on the facing page. The similar words "oeHcc8", "oeHcc8a", "oeHcc8ae" occur a few times later on (except for one early occurrence of "oeHcc8a" on f75r). The word "8oe" appears first on f76r, and is fairly frequent later on.

    75r 76r 78r 79r 84r
      1  21.5  16.7  oeHcc     .....................1.....................................
      1  55.5  16.7  oeHcc8    .......................................................1...
     15  40.1  13.4  oeHcc8a   ..1........................1.11....11......2..121...11..1..
      1  52.5  16.7  oeHcc8ae  ....................................................1......
     26  36.5  15.2  8oe       ..........1.1..1.1..11.2...111...1..1.1.1.1.111....2....221

"oHoeor": The word "oHoeor" doesn't occur. However, "oHoe" occurs 11 times (see above), and "or" is quite common. Its distribution is lumpy, and there is a cluster beginning on f77v.

    75r 76r 77v
     66  28.5  16.3  or        13..3.2..14.1.....1.231235312.15....1412.2..1.....1..213112

"orccc8ae": The word "orccc8ae" does not occur. But "or" is fairly common (see above), and "ccc8ae" occurs thrice, roughly at the right place:

    76r 77v
      3  14.2   6.8  ccc8ae    ..........1.1......1.......................................

    ccae zcc8 qoHcc8a ??? oHae ccc8ae // 8ae or ccc8a qoeam qoccHcc8a 8cc8a // Hccc8a ezccc8a ccc8ae ccc8a ccccHcca // Poezc8ae oHc8aw qoHcca ??? // Hzcc8a qoHam ccc8ae Pccc8a cccHa zam ccc8a qoHam

"zororae": The word "zororae" does not occur. The word "zor" occurs 13 times, with no obvious clustering, and "orae" occurs once. The similar word "oroe" occurs 6 times, but not particularly in the right spot.
But "or" occurs in the right place (see above) and "ae" does occur in three clusters, nearby:

    76r 77v 78v 80r
     13  30.2  17.1  zor       ...11......1......1..1...1........2....11........11.......1
     66  28.5  16.3  or        13..3.2..14.1.....1.231235312.15....1412.2..1.....1..213112
     20  21.5  15.2  ae        ......311.22.111.........11..........112.1................1
      1  48.5  16.7  orae      ................................................1..........
      6  29.2  18.4  oroe      ....1.1..........................1.1...1...............1...

"oecoc8m": The word "oecoc8m" does not occur. Indeed, the sequence "coc" occurs only twice in the bio section, and "oc8" doesn't occur at all. So, the "o" may be a misreading. The sequences "cac" and "ac8" don't occur, either. If we read it as "oeccc8m", we get some close matches:

    76r 77v
      1  39.5  16.7  oeccc8    .......................................1...................
      1  44.5  16.7  Poeccc8   ............................................1..............
      6  18.3  19.2  qoeccc8a  1....1.11..........................1................1......
     24  26.5  19.3  oeccc8a   ...2.31...2.2....11....11.......1.....1....11..11.1...1...2

"oHa": The word "oHa" is fairly common, with a cluster of occurrences at the right spot. Its "inflected" forms "oHam", "oHae", "oHar" are even more common, with suggestive clustering.

    76r 77v
     31  25.9  14.6  oHa       ..11...1.231.2......1111...1.12.12..1..2.1.1..21......1....
    105  30.4  14.6  oHam      .1.121223223.3.12.1..21.133422653535141114..3..43.33.12.111
     43  32.8  15.9  oHae      ..11.11...21......1.1.21123.1..1111122..1111..2.11...12..22
     40  20.4  16.7  oHar      214211.113121.......2.11.3...12.1.1..1.2..1....1......11.1.

I made this report into an HTML page on my Voynich site.

I take these statistics as being mildly encouraging: most of the labels can be found in the text, and in several cases there is a cluster of occurrences on page f77v or soon thereafter.
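
For readers who want to reproduce the per-page occurrence rows, here is a minimal Python sketch. It is not the script that produced the rows in this report. It assumes the text is available as a list of pages, each a list of word tokens (a hypothetical input shape), and that the second column of each row is the mean page position, taking the midpoint of each page slot. That reading reproduces the 33.9 shown for "oHoe", but it is still a guess. The third column looks like a measure of spread whose formula is not given, so the sketch leaves it out. The `*` overflow mark for ten or more occurrences is also an assumption.

```python
def distribution_row(pages, word):
    """Return (count, mean_position, profile) for `word`.

    pages   -- list of pages, each a list of word tokens (assumed input shape)
    profile -- one character per page: '.' for no occurrence, the digit for
               1..9 occurrences, '*' for ten or more (assumed overflow mark)
    """
    per_page = [page.count(word) for page in pages]
    total = sum(per_page)
    profile = "".join(
        "." if n == 0 else str(n) if n < 10 else "*" for n in per_page
    )
    if total == 0:
        return 0, None, profile
    # Assumed convention: page i contributes at position i + 0.5.
    mean = sum((i + 0.5) * n for i, n in enumerate(per_page)) / total
    return total, mean, profile


def format_row(pages, word):
    """Format one row as: count, mean position, word, per-page profile."""
    total, mean, profile = distribution_row(pages, word)
    mean_txt = "    -" if mean is None else f"{mean:5.1f}"
    return f"{total:4d} {mean_txt}  {word:<8}  {profile}"


if __name__ == "__main__":
    # Rebuild fake pages from the "oHoe" profile in the report, then
    # check that the same profile, count and mean come back out.
    oHoe = "......1......1..........1.1...111.....1.............1....11"
    pages = [["oHoe"] * (0 if ch == "." else int(ch)) for ch in oHoe]
    print(format_row(pages, "oHoe"))
```

Run on pages rebuilt from the "oHoe" profile, this gives back the same profile with a count of 11 and a mean position of 33.9.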
On the positive side, I think this data supports my belief that Voynichese is a natural language (and not a complex cypher or random text), and that the "words" are indeed words (i.e. units of meaning). On the negative side, I am worried by the apparent inconsistency in spelling and word spacing in the manuscript itself. These observations give some more weight to the "ignorant scribe" hypothesis (that the Beinecke VMs is a copy, made by two or more scribes who could not understand the original). The apparent confusion between "a" and "o" in the original manuscript suggests that I should identify those two letters in my next error-tolerant encoding...
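
As an illustration of what identifying "a" and "o" could look like in practice, here is a minimal Python sketch. It is not the encoding itself, which is not specified here. The function names and the idea of folding "a" into "o" before counting are assumptions made for the example. The sample words come from one of the "oHoe" concordance lines above.

```python
from collections import Counter

# Letters treated as interchangeable. Only the a/o pair comes from the text;
# expressing it as a translation table is an assumption of this sketch.
FOLD = str.maketrans({"a": "o"})


def fold(word):
    """Map a word to its error-tolerant form (every 'a' read as 'o')."""
    return word.translate(FOLD)


def tolerant_counts(text_words, labels):
    """For each label, return (exact matches, matches after folding)."""
    exact = Counter(text_words)
    folded = Counter(fold(w) for w in text_words)
    return {lab: (exact[lab], folded[fold(lab)]) for lab in labels}


if __name__ == "__main__":
    sample = "oe ar zccHca eHoe oHoe ccc8a qoHam cccHca qoHae".split()
    for label, (n_exact, n_folded) in tolerant_counts(
        sample, ["aHoe", "oHoe"]
    ).items():
        print(label, n_exact, n_folded)
```

On this sample the label "aHoe" has no exact match but one folded match, namely "oHoe". The folding is blunt, though. It also makes "oHae" and "oHoe" indistinguishable, so folded counts will merge words that the statistics above treat as separate.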