From jim@mail.rand.org Fri Oct 2 01:32 EST 1998
Message-ID: <3614C69C.545F@alphalink.com.au>
Reply-To: jguy@alphalink.com.au
References: <361399B5.57FF3800@sprint.ca>
Content-Transfer-Encoding: 7bit
Date: Fri, 02 Oct 1998 05:27:08 -0700
From: jguy <jguy@alphalink.com.au>
To: John Grove <4groves@sprint.ca>
CC: voynich@rand.org
Subject: Re: The Nature of the Analyst

John Grove wrote:

> > The forms all have counterparts starting with <i>: <ig>, <x>, <2>,
> > etc. We also have <a> = <c>+<i>.

My view is that <c> and <i> are equivalent, each occurring in the context of strokes similar to itself. Cryptologia published an article of mine a long time ago where I showed that the two sets of letters, the c-like and the i-like, occurred in almost completely mutually exclusive variation. That can be due (in a linguist's eye) to two things:

1. they are allographs of the same grapheme (like the two forms of small beta in Greek)
2. extensive vowel or consonant harmony

Later, but still a long time ago, I argued on this list that <cc> and <a> were two different ways of writing "a". It had not even occurred to me that <a> = <c> + <i>.

> All the letters containing an initial
> "c"-curve are also the only letters that can be preceded in the same
> word by the little letter that looks
> like "c," e.g. <c89>, <ccc89>. On the other hand, the letters <x> and
> <2> (which have very high frequencies) can *never* be preceded by
> <c>, *ever*; they are instead
> preceded by <a> or <o>."

> Now the fact that he saw these things as 'two-stroke' characters
> seems promising to me -- as it supports my observations. However, it
> may simply be that Currier was employed in roughly the same field as
> I work in - and thus analyzes things from the same perspective. What
> was his job? If he was a cryptanalyst

He was. But I am a linguist, and I reported the same phenomenon. I did not know about Currier at the time, either.
So that makes his observation all the more credible. When results converge...

> Jorge, on the other hand, has attacked the VMS from a linguistic
> point of view

Jorge is a computer scientist. So now that's three viewpoints that converge: cryptology, linguistics, computer science.

> - there are just not enough characters in just the
> right places to form a simple alphabetic language

Yes there are! Look in the archives: before the invention of EVA, when we were groping for a "pronounceable Voynich", I came up with two. One looked like a sort of mock-Latin which sounded grand and mysterious (good for ceremonial magic?); the other, more serious, looked much like a sort of Indonesian.

Piraha, a South American Indian language, has only three vowels and seven consonants. But it has tones. Rotokas (in Papua New Guinea) does not: it has five vowels and six consonants, but no tones.

Further, another thought. If you get your hands on a New Testament in Bislama, the Pidgin English of Vanuatu, you'll see words like "God", and "kot" (coat), and "gyaman" (to lie). But Bislama has neither g nor d. "God" and "coat" are pronounced exactly alike: kot. What gives? The spelling was devised by a Rev. Camden, a Presbyterian. He got it right, but the native Elders complained that it made the language look too "childish". So, very stupidly, he caved in, and decided on a spelling partly based on English. Hence "God" instead of "Kot". And "gyaman" instead of "kyaman". He thought that "kyaman" was a distortion of "gammon" but... it's a Chinese word! As for "mbusong", which his mob pronounced "putsong", he wrote it like he heard it: "pujong". Well, if I tell you that it means "cork" you may guess the etymology: French bouchon.

Perhaps a similar thing happened with the VMS, which would explain its haphazard spelling.

At any rate the Voynich alphabet has enough letters to write Rotokas or Piraha, and enough to spare to write the allophonic variations of their phonemes. E.g.
in Rotokas, t is pronounced ts or s before i. In Piraha, t is pronounced either as a plain t or as a t accompanied by a bilabial trill (the choice is apparently at the speaker's whim).

> There are three things
> about the lines that make me believe the line itself is a functional
> unit. The frequency counts of the beginnings and endings of lines are
> markedly different from the counts of the same characters internally.

That is normal. The frequency counts of the beginnings and endings of lines in Italian are markedly different from the medial ones. Why? Because an Italian word almost always ends in a vowel, but usually starts with a consonant. No, I haven't carried out any statistics on that, but I learnt Italian reading Topolino when I was 15, and the strange distribution of consonants and vowels had struck me: it did not look like a "real" language. I mean... "tavola" to say "table"? Are you joking? And this word: "popolo". Surely, Sir, you are pulling my leg!

> There are, for instance, some characters that may not occur initially
> in a line.

In Ancient Basque, no word could start with a "k", a "p", or a "t". Ancient Basque had no "m". When Basque acquired an "m", it was from "nb" becoming "mb", then "m", so no word could start (or end) with an "m".

Since line length varies in the VMS, we can infer that the authors did not break words. Or, if they did, they did it at syllable boundaries. In all the languages I can think of right now (French, English, Italian, Swahili, Spanish, Chinese, Japanese...) the distribution of phonemes is quite different syllable-initially, medially, and finally.

> Okay, if the first character of a line is a line indicator of some
> sort...

If the VMS is written in a natural human language, like the many I have learnt and the many more I know about, then there is nothing there to write home about. The scribes did not break words randomly, like th is, that's al l fo lks!

> Jorge?
> You've got
> quite a computer system for crunching numbers -- Is this worth
> looking into?

Of course it is worth looking into. But you don't need a high-powered number cruncher. If you search the archives, you'll find a Pascal program I wrote that computes letter frequencies in those different positions. I must have written it almost 10 years ago when I had only a state-of-the-art PC, a 386DX running at 33MHz, with a super whopper 8M of RAM. I think that, somewhere in those archives, there are tables of letter frequencies produced by that thing.

No, if there is a cipher there I think it can only be a Bacon cipher, but not binary. If gallows were instructions to switch to a different coding wheel, we'd observe strikingly different letter frequencies between gallows, and strikingly different letter groups. We don't.
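
P.S. For anyone who would rather not dig out the Pascal program, here is a minimal sketch in Python of the same kind of count. It is not that program, only an illustration of the idea: tally each letter separately at the start of a line, at the end of a line, and at the start, middle, and end of a word. The function name positional_frequencies, the assumption that words are separated by spaces with one character per letter, and the little Italian sample are all mine, made up for the example.

```python
from collections import Counter


def positional_frequencies(lines):
    """Tally letters by position in a list of text lines.

    Assumes (for illustration only) that words are separated by
    whitespace and that each character stands for one letter.
    A one-letter word counts as both word-initial and word-final.
    """
    counts = {
        "line_initial": Counter(),
        "line_final": Counter(),
        "word_initial": Counter(),
        "word_medial": Counter(),
        "word_final": Counter(),
    }
    for line in lines:
        words = line.split()
        if not words:
            continue  # skip blank lines
        counts["line_initial"][words[0][0]] += 1
        counts["line_final"][words[-1][-1]] += 1
        for word in words:
            counts["word_initial"][word[0]] += 1
            counts["word_final"][word[-1]] += 1
            # everything strictly between the first and last letter
            counts["word_medial"].update(word[1:-1])
    return counts


# Made-up sample lines, not real data.
sample = ["il popolo mangia alla tavola", "la casa nuova"]
freq = positional_frequencies(sample)
for position, counter in freq.items():
    print(position, counter.most_common(3))
```

Run on a transcription with one manuscript line per text line, the five tables can be compared side by side. If line-initial and line-final counts differ from the medial ones no more than word-initial and word-final counts do, that is what a natural language with unbroken words would give.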