From stolfi Thu Sep 26 23:00:32 -0003 2002
From: Jorge Stolfi <stolfi@dcc.unicamp.br>
To: voynich@cryptogram.org
Subject: Re: Piraha and the VMS
In-Reply-To: <200209250531.g8P5VXh4019526@mail3.alphalink.com.au>
References: <200209250531.g8P5VXh4019526@mail3.alphalink.com.au>
Reply-To: stolfi@dcc.unicamp.br


    > [Jacques:] It might, only just might do, for Piraha, an
    > Amazonian language with 7 consonants and 3 vowels, ignoring its
    > two tones, and breaking up its consonant clusters, Linear-B
    > style. (Jorge, they're your next-door neighbours, how about...
    > oh, just pulling your leg).

By amazing coincidence, I happen to have a book about the Pirahã
language (which had about 110 speakers left in ~1980).  Here is a 
sample sentence from that book:

  (1) xaíti xaibogi xaigahápiso xisibáobábagaí sagía xabáobihiabá

which the author parses as

  xaíti              peccary 
  xaibogi            quick  
  xaig:ahá:p:i:so    toMove:toGo:IMPERFECTIVE:NEAR:TEMPORAL 
  xisib:áo:b:ábagaí  toShootArrow:TELIC:PERFECTIVE:FRUSTRATED
  sagía              animal
  xab:áo:b:i:hiab:á  toStop:TELIC:PERFECTIVE:EPENTHETIC:NEGATIVE:REMOTE
  
and translates as 

  "while the peccary was fleeing, I almost shoot an arrow at it; it
  didn't stop."
  
As you can see, Pirahã has rather long words, usually made of 
a root with a couple of syllables and several suffixes, which
are often just one syllable or part thereof.  I gather that most 
American native languages follow this pattern, which also
fits Turkish and Hungarian (IIRC).

Now, this pattern definitely does not fit the VMS word length
distribution, which is practically zero beyond 10 letters or so.

Jacques suggests that such languages might match the VMS better
if each word element were written as a separate word, e.g.

  (2) xaíti xaibogi xaig ahá p i so xisib áo b ábagaí sagía xab áo b i hiab á

Perhaps... 

However, it seems to me that a full decomposition would have the
opposite problem, namely we would get many more 1- and 2-letter words
than we see in the VMS. So, in order to get a good match, we may have
to assume a partial decomposition, where certain combinations of
suffixes are still written as single words.
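
A rough way to check this on samples (1) and (2) is to tabulate their
word lengths. The Python sketch below is only an illustration: the
function name length_histogram is made up for the example, each
written character (including accented vowels and the "x") is assumed
to count as one letter, and a single sentence is of course far too
small a sample to say anything about the language as a whole.

```python
import unicodedata
from collections import Counter

# Sample (1): words written whole. Sample (2): every element split off.
whole = "xaíti xaibogi xaigahápiso xisibáobábagaí sagía xabáobihiabá"
split = "xaíti xaibogi xaig ahá p i so xisib áo b ábagaí sagía xab áo b i hiab á"

def length_histogram(text):
    """Map word length (in letters) to the number of words of that length."""
    # Normalize to NFC so an accented vowel counts as one letter
    # (assumption: one written character = one letter).
    text = unicodedata.normalize("NFC", text)
    return Counter(len(word) for word in text.split())

for label, text in (("(1) whole", whole), ("(2) split", split)):
    hist = length_histogram(text)
    total = sum(hist.values())
    short = hist[1] + hist[2]                          # 1- and 2-letter words
    long_ = sum(n for k, n in hist.items() if k > 10)  # words over 10 letters
    print(label, sorted(hist.items()),
          f"short: {short}/{total}", f"long: {long_}/{total}")
```

On these two lines, three of the six words in (1) are longer than 10
letters, while in (2) nine of the eighteen words have only one or two
letters.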

Another problem with the "Amerind" theory is that the main roots in
Amerind languages are often 2 or 3 syllables long. Such roots would
not have the peculiar structure we see in the VMS words (at most one
gallows, different letters at beginning/middle/end, etc.).

Finally, the idea of writing each suffix as a separate word, as in (2)
above, would be rather peculiar, since all the early European
transcriptions of Amerind languages that I have seen wrote the
suffixes attached to the root, as in (1).

All the best,

--stolfi