From VM Thu Sep 26 23:21:00 2002
Message-Id: <200209270200.g8R20XN27782@xingu.dcc.unicamp.br>
References: <200209250531.g8P5VXh4019526@mail3.alphalink.com.au>
Date: Thu, 26 Sep 2002 23:00:33 -0300 (EST)
From: Jorge Stolfi <stolfi@ic.unicamp.br>
To: voynich@cryptogram.org (Voynich Ms. mailing list)
In-Reply-To: <200209250531.g8P5VXh4019526@mail3.alphalink.com.au>
Subject: VMs:  Re: Piraha and the VMS


    > [Jacques:] It might, only just might do, for Piraha, an
    > Amazonian language with 7 consonants and 3 vowels, ignoring its
    > two tones, and breaking up its consonant clusters, Linear-B
    > style. (Jorge, they're your next-door neighbours, how about...
    > oh, just pulling your leg).

By amazing coincidence, I happen to have a book about the Pirahã
language (which had about 110 speakers left in ~1980).  Here is a 
sample sentence from that book:

  (1) xaíti xaibogi xaigahápiso xisibáobábagaí sagía xabáobihiabá

which the author parses as

  xaíti              peccary 
  xaibogi            quick  
  xaig:ahá:p:i:so    toMove:toGo:IMPERFECTIVE:NEAR:TEMPORAL 
  xisib:áo:b:ábagaí  toShootArrow:TELIC:PERFECTIVE:FRUSTRATED
  sagía              animal
  xab:áo:b:i:hiab:á  toStop:TELIC:PERFECTIVE:EPENTHETIC:NEGATIVE:REMOTE
  
and translates as 

  "while the peccary was fleeing, I almost shoot an arrow at it; it
  didn't stop."
  
As you can see, Pirahã has rather long words, usually made of 
a root with a couple of syllables and several suffixes, which
are often just one syllable or part thereof.  I gather that most 
American native languages follow this pattern, which also
fits Turkish and Hungarian (IIRC).

Now, this pattern definitely does not fit the VMS word length
distribution, which is practically zero beyond 10 letters or so.

Jacques suggests that those languages may show a better match to the
VMS, if each word element is written as a separate word, e.g.

  (2) xaíti xaibogi xaig ahá p i so xisib áo b ábagaí sagía xab áo b i hiab á

Perhaps... 

However, it seems to me that a full decomposition would have the
opposite problem, namely we would get many more 1- and 2-letter words
than we see in the VMS. So, in order to get a good match, we may have
to assume a partial decomposition, where certain combinations of
suffixes are still written as single words.
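
As a rough check of the point above, here is a minimal Python sketch
that tallies word lengths for sentence (1) as printed and for the full
decomposition (2). The function and variable names are made up for
illustration, length is counted in written letters of this orthography
(with "x" and each accented vowel as one letter), and a single sentence
is of course not a statistical sample.

```python
import unicodedata
from collections import Counter

# Sentence (1) as written in the book, and its full decomposition (2).
ATTACHED = "xaíti xaibogi xaigahápiso xisibáobábagaí sagía xabáobihiabá"
SPLIT = "xaíti xaibogi xaig ahá p i so xisib áo b ábagaí sagía xab áo b i hiab á"


def length_histogram(text):
    """Count how many words of each length (in letters) occur in text."""
    # Normalize to NFC so that each accented vowel counts as one letter.
    text = unicodedata.normalize("NFC", text)
    return Counter(len(word) for word in text.split())


def show(label, text):
    """Print a crude bar chart of the word-length distribution."""
    hist = length_histogram(text)
    print(label)
    for n in range(1, max(hist) + 1):
        print(f"  {n:2d} {'*' * hist[n]}")


show("(1) suffixes attached:", ATTACHED)
show("(2) fully decomposed:", SPLIT)
```

For this one sentence, three of the six words in (1) are longer than 10
letters, while in (2) nine of the 18 pieces have only 1 or 2 letters.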

Another problem with the "Amerind" theory is that the main roots in
these languages are often 2 or 3 syllables long. These words would
not have the peculiar structure we see in the VMS words (at most one
gallows, different letters at beginning/middle/end, etc.).

Finally, the idea of writing each suffix as a separate word, as in (2)
above, would be rather peculiar, since all early European
transcriptions of Amerind languages which I have seen wrote the
suffixes attached to the root, as in (1).

All the best,

--stolfi