From jim@mail.rand.org  Sat Jan  1 05:30:20 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id FAA59331
	for <reeds@fry.research.att.com>; Sat, 1 Jan 2000 05:30:20 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id EEB931E04A; Sat,  1 Jan 2000 05:09:36 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-green.research.att.com (Postfix) with ESMTP id 1BCE01E02A
	for <reeds@research.att.com>; Sat,  1 Jan 2000 05:09:35 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id CAA11292; Sat, 1 Jan 2000 02:09:31 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id CAA03482; Sat, 1 Jan 2000 02:09:30 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id CAA03770 for <voynich@rand.org>; Sat, 1 Jan 2000 02:07:49 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id CAA03465 for <voynich@rand.org>; Sat, 1 Jan 2000 02:07:48 -0800 (PST)
Received: from mailout06.sul.t-online.de (mailout06.sul.t-online.de [194.25.134.19]) by mail01-lax.pilot.net with ESMTP id CAA11229 for <voynich@rand.org>; Sat, 1 Jan 2000 02:07:47 -0800 (PST)
Received: from fwd04.sul.t-online.de 
	by mailout06.sul.t-online.de with smtp 
	id 124LRq-0006Hg-00; Sat, 1 Jan 2000 11:07:46 +0100
Received: from  (0625764225-0001@[62.156.12.24]) by fwd04.sul.t-online.de
	with smtp id 124LRg-1kyahcC; Sat, 1 Jan 2000 11:07:36 +0100
From: Zandbergen@t-online.de (Rene)
To: voynich@rand.org
References: <Pine.GSO.4.20.9912311318370.29804-100000@haywire.csuhayward.edu>
Subject: Re: Discovery Channel program
X-Mailer: T-Online eMail 2.3
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT
Date: Sat, 1 Jan 2000 11:07:36 +0100
Message-ID: <124LRg-1kyahcC@fwd04.sul.t-online.de>
X-Sender: 0625764225-0001@t-dialin.net
Sender: jim@mail.rand.org
Status: OR

Dan Moonhawk Alford wrote:

> Susan Tsantiris talked about it, then Robert Brumbaugh Jr 
> was shown continuing his Yale Prof. father's work on the Ms.
> (Are neither of these people on our List?)

Who is Susan Tsantiris?
In any case, I once asked Robert Babcock if he had contact info
of Robert Conrad Brumbaugh (Jr.) and he said that he didn't.
There is a daughter around as well, but she married and Babcock
didn't know her new family name. She (as opposed to her brother)
still lives in the New Haven area.

My reason for seeking contact was that they might still have in
their possession images of folio 1r with or without the application
of chemicals. These might reveal the additional 'hidden' information
on this page: the year number (1*30), the word 'Prag' near the signature
and the alphabet tables.

> Following his father's "key," Brumbaugh matched 14 distinct 
> "letters" to numeric values; he says he found "arithmetic 
> problems" in the marginal notes of one of the folios 
> (which?) that helped confirm the numeric key. Reconstructing
> the label near "one of his favorite pictures," a plant, 
> using the numeric key -- which lays out numbers with the 
> English alphabet -- 7-5-7-7-5-2, he gets "p-e-p-p-er," which
> could be the name of the plant pictured.  He read "more and 
> more" of it this way. Voila! Case solved.

This is all his father's work. I don't recognise anything new.
Except perhaps, that the plant name was originally decoded as
'pepperquoqus' and Jr. just left out the 'quoqus'. Another plant
apparently was called pepperhelayc.

> So this life's work of Brumbaugh, continued by his son, 
> makes incredibly little sense to me on the face of it. Can 
> anyone straighten me out on this, or comment?

In addition to Jim's remarks, the weakness of Brumbaugh's case
never becomes more apparent than in his decoding of star names
in the zodiac section.

Cheers,
   Rene (and welcome all in a new era in Voynich MS decoding!!)

From reeds Sat Jan  1 20:44:21 2000
From: reeds@fry.research.att.com (Jim Reeds)
Message-Id: <1000101204421.ZM2399335@fry.research.att.com>
Date: Sat, 1 Jan 2000 20:44:21 -0500
X-Mailer: Z-Mail (4.0.1 13Jan97)
To: voynich@rand.org
Subject: 1999 VMS archived email traffic
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Status: OR

I have just put all the 1999 voynich@rand.org email traffic
on my web site, in a 1.36 Mb zip file 

	http://www.research.att.com/~reeds/voynich/99.zip

which can also be reached via

	http://wwwc.research.att.com/~reeds/voynich/vmail.html

As always, the coverage is only partially correct: I might not
have received all postings, might have deleted some by mistake,
might have dropped attachments along the way, etc.  (I did remove 2
posts which the authors withdrew soon after posting.)

-- 
Jim Reeds, AT&T Labs - Research
Shannon Laboratory, Room C229, Building 103
180 Park Avenue, Florham Park, NJ 07932-0971, USA

reeds@research.att.com, phone: +1 973 360 8414, fax: +1 973 360 8178

From jim@mail.rand.org  Sun Jan  2 23:30:44 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id XAA75091
	for <reeds@fry.research.att.com>; Sun, 2 Jan 2000 23:30:43 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id C7C5A4CE08; Sun,  2 Jan 2000 23:30:43 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-blue.research.att.com (Postfix) with ESMTP id 2DC5D4CE06
	for <reeds@research.att.com>; Sun,  2 Jan 2000 23:30:43 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id UAA29209; Sun, 2 Jan 2000 20:30:35 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id UAA01128; Sun, 2 Jan 2000 20:30:34 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id UAA06014 for <voynich@rand.org>; Sun, 2 Jan 2000 20:30:02 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id UAA01112 for <voynich@rand.org>; Sun, 2 Jan 2000 20:30:01 -0800 (PST)
Received: from yarf.eecs.umich.edu (yarf.eecs.umich.edu [141.213.12.211]) by mail03-lax.pilot.net with ESMTP id UAA26853 for <voynich@rand.org>; Sun, 2 Jan 2000 20:30:00 -0800 (PST)
Received: (from kckluge@localhost)
	by yarf.eecs.umich.edu (8.9.3/8.9.1) id XAA02203;
	Sun, 2 Jan 2000 23:29:58 -0500 (EST)
Date: Sun, 2 Jan 2000 23:29:58 -0500 (EST)
Message-Id: <200001030429.XAA02203@yarf.eecs.umich.edu>
From: Karl Kluge <kckluge@eecs.umich.edu>
To: voynich@rand.org
Subject: Extext of "True and Faithful Relation," Sloane 3188?
Sender: jim@mail.rand.org
Status: OR


Howdy, and Happy New Year! Much to my relief, the Y2K bug doesn't appear 
to be causing the collapse of industrial society (I don't think I'd have
a lot of marketable job skills in a post-Apocalyptic economy), so it's
back to thinking about the Voynich.

One of the vexing things is that while circumstantial evidence seems to
favor Dee's ownership, he doesn't appear to mention it at all. After all,
he explicitly asks the angels for help with the tables in the "Book of
Soyga," so it seems odd that he wouldn't ask about the Voynich. Has anyone
read the "True and Faithful Relation" with a careful eye to possible
references? 

Alternatively, does anyone have etexts of TFR and the earlier workings in
Sloane 3188? Some carefully thought out grepping might help double check
that nothing was missed. 
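
[Editorial sketch, not part of Karl's message: the kind of grepping
suggested above might look roughly like the following. The search
patterns and the sample lines are made up for illustration only; they
are not quotations from TFR or Sloane 3188, and a real search would
need period spellings and an actual etext.]

```python
import re

# Hypothetical search terms (assumptions, not a vetted list).
PATTERNS = [
    r"\bbooke?\b",
    r"\bcipher\w*",
    r"\bcharacters?\b",
    r"\bRoger\s+Bacon\b",
]


def grep_etext(lines, patterns=PATTERNS, context=1):
    """Yield (line_number, pattern, snippet) for each case-insensitive match."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    lines = list(lines)
    for i, line in enumerate(lines):
        for rx in compiled:
            if rx.search(line):
                # Include `context` lines before and after the hit.
                lo = max(0, i - context)
                hi = min(len(lines), i + context + 1)
                snippet = " ".join(s.strip() for s in lines[lo:hi])
                yield i + 1, rx.pattern, snippet


# Invented sample lines standing in for an etext.
sample = [
    "He shewed me a booke of strange characters.",
    "Nothing of note this day.",
    "The cipher tables were layd upon the table.",
]
hits = list(grep_etext(sample, context=0))
for number, pattern, snippet in hits:
    print(number, pattern, snippet)
```

With a real etext, the lines of the opened file would be passed in
place of `sample`.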

Karl

From jim@mail.rand.org  Mon Jan  3 03:15:43 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id DAA70430
	for <reeds@fry.research.att.com>; Mon, 3 Jan 2000 03:15:43 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id E6DD01E022; Mon,  3 Jan 2000 03:15:42 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail03-lax.pilot.net (mail-lax-3.pilot.net [205.139.40.17])
	by mail-green.research.att.com (Postfix) with ESMTP id 755D71E021
	for <reeds@research.att.com>; Mon,  3 Jan 2000 03:15:42 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail03-lax.pilot.net with ESMTP id AAA12080; Mon, 3 Jan 2000 00:15:16 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id AAA04380; Mon, 3 Jan 2000 00:15:15 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id AAA10405 for <voynich@rand.org>; Mon, 3 Jan 2000 00:15:05 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id AAA04366 for <voynich@rand.org>; Mon, 3 Jan 2000 00:15:04 -0800 (PST)
Received: from ns1.ovis.net (ns1.ovis.net [207.0.147.2]) by mail01-lax.pilot.net with ESMTP id AAA14385 for <voynich@rand.org>; Mon, 3 Jan 2000 00:15:03 -0800 (PST)
Received: from ovis.net (s32.pm5.ovis.net [207.0.147.98])
	by ns1.ovis.net (8.9.3/8.9.3) with ESMTP id DAA12794;
	Mon, 3 Jan 2000 03:14:54 -0500
Message-ID: <38705A84.77CF0F9@ovis.net>
Date: Mon, 03 Jan 2000 03:15:00 -0500
From: Steve Kudlak <chromexa@ovis.net>
Reply-To: chromexa@ovis.net
X-Mailer: Mozilla 4.5 [en]C-CCK-MCD ezn/58/n  (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Karl Kluge <kckluge@eecs.umich.edu>
Cc: voynich@rand.org
Subject: Re: Extext of "True and Faithful Relation," Sloane 3188?
References: <200001030429.XAA02203@yarf.eecs.umich.edu>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR



Karl Kluge wrote:

> Howdy, and Happy New Year! Much to my relief, the Y2K bug doesn't appear
> to be causing the collapse of industrial society (I don't think I'd have
> a lot of marketable job skills in a post-Apocalyptic economy), so it's
> back to thinking about the Voynich.
>
> One of the vexing things is that while circumstantial evidence seems to
> favor Dee's ownership, he doesn't appear to mention it at all. After all,
> he explicitly asks the angels for help with the tables in the "Book of
> Soyga," so it seems odd that he wouldn't ask about the Voynich. Has anyone
> read the "True and Faithful Relation" with a careful eye to possible
> references?
>
> Alternatively, does anyone have etexts of TFR and the earlier workings in
> Sloane 3188? Some carefully thought out grepping might help double check
> that nothing was missed.
>
> Karl

I and my friend Partick Muir Scheible at the University of Washington essayed
this. It was on microfilm and I had to keep switching lenses. It is all very
vague, as far as I remember. My copies of the microfilm were destroyed in a flood.
I wrote to the British Library and was never able to get more copies. There were
3 Sloane MSS, and I have forgotten all the numbers. These are the ones we
loaded into a photographic enlarger and produced enlargements from which we
made photocopies as a cost-effective measure. It is very hard to detect
anything that directly points to the VMS.

If there is an ETEXT I would be willing to try again with more modern
technology. Dee, and one must add Kelley, were clearly working with a pretty
clear text, in spite of Enochian and all that. But all that being said, a
"try, try again..." might yet bear some fruit. Are there ETEXTS floating around?

Have Fun,
Sends Steve


From jim@mail.rand.org  Mon Jan  3 12:49:31 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id MAA75280
	for <reeds@fry.research.att.com>; Mon, 3 Jan 2000 12:49:31 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id E97BD1E00D; Mon,  3 Jan 2000 12:49:30 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-green.research.att.com (Postfix) with ESMTP id 392E11E00C
	for <reeds@research.att.com>; Mon,  3 Jan 2000 12:49:30 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id JAA03426; Mon, 3 Jan 2000 09:49:23 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id JAA22980; Mon, 3 Jan 2000 09:49:22 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id JAA06920 for <voynich@rand.org>; Mon, 3 Jan 2000 09:49:06 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id JAA22951 for <voynich@rand.org>; Mon, 3 Jan 2000 09:49:05 -0800 (PST)
Received: from sylt.pixelpark.com (sylt.pixelpark.com [62.52.66.77]) by mail03-lax.pilot.net with ESMTP id JAA28802 for <voynich@rand.org>; Mon, 3 Jan 2000 09:49:04 -0800 (PST)
Received: from pixelpark.com (hh-pc-68-111.pixelpark.com [62.52.68.111])
	by sylt.pixelpark.com (8.9.3/8.9.3) with ESMTP id SAA06407
	for <voynich@rand.org>; Mon, 3 Jan 2000 18:49:02 +0100 (MET)
Message-ID: <3870E0A7.2C579C88@pixelpark.com>
Date: Mon, 03 Jan 2000 18:47:19 +0100
From: Andreas Wilhelm <wilhelm@pixelpark.com>
X-Mailer: Mozilla 4.7 [en] (Win95; I)
X-Accept-Language: de-DE
MIME-Version: 1.0
To: Voynich Mailing List <voynich@rand.org>
Subject: Request For Status: Language vs. Cipher / Facsimile Edition
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR

Hi All,

I suppose nobody is really interested in in-depth epic details about
who I am and what I care for ;-) , but just for the record: I'm the
"new guy" on the list (did anybody notice except for Jim Gillogly?). My
name is Andreas, and I am working as a producer at a multimedia agency in
Hamburg. Cheers!

I have been reading about the Voynich Manuscript for quite a while and
have checked the various websites, documents and reports. I am quite honored
to find myself on this list and thereby be in direct contact with all
these people, Reeds, Landini, Stolfi, Guy, Zandbergen, to name only a
few!

I am currently trying to wade through the whole of Jim's compiled
mailing list traffic (which is a great thing with which to start off!),
however, to shorten my efforts, it would be great if I could get a
summarised status of discussion on two issues:

Language vs. Cipher
What are the main arguments for the thesis that MS408 is written in cipher,
and what are the main arguments for it being a (maybe artificial)
language?
I understand that Newbold's "decipherment" has been identified as
insufficient (read "wrong") at best. Brumbaugh also seems to be
walking in these footsteps. Also a "Glen" (writing from
"cryptography@home.com") is strongly arguing in favour of
"decipherment". There are a great many papers on entropy and Zipf's
law, but is there also already a summarized (somewhat final) conclusion
of the findings? Are there people actively working on a "decipherment"
of the book or are people mainly involved in the structural and
linguistic (semantic) analysis?
Who are the "key-players" in this discussion (if I may ask so bluntly)?
btw: Is the text of Newbold's "decipherment" and possibly the
counter-arguments available online or digitally?

CD-ROM / Wishlist - Facsimile production
There have been numerous statements on how badly needed a high-quality
edition of MS408 would be. People have been exchanging photos,
photocopies and GIFs. There have been wishlists, hopes for a CD-ROM
edition, price lists from the Yale Library (for photos) and so forth.
I would like to know what the status of this issue is. What percentage
of the pages (and which pages) is available so far, and in what quality?
Is there anyone involved in organising or in the production of a
high-quality reproduction of the whole MS408?
I am asking this, not because I wish to start my own Voynich collection,
but because I am about to start this as a project; the production and
publication of a high-quality facsimile edition of MS408 (comparable to
the Thames&Hudson hardcover slip-case edition of the "Book of Kells"). I
don't wish to interfere with any other on-going projects or
communications/negotiations with Yale, therefore I would like to know
the exact status on this.

All the best for the start of the new Mill... NewYear (successfully
avoiding the M-word) and cheers from Hamburg,

Andreas
--


Pixelpark AG. . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Andreas Wilhelm  .  Producer
Schulterblatt 58  .  20357 Hamburg  .  Germany
phone: + 49 40 432 03 - 37  .   fax: - 20

http://www.pixelpark.com


From jim@mail.rand.org  Mon Jan  3 14:43:38 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id OAA45500
	for <reeds@fry.research.att.com>; Mon, 3 Jan 2000 14:43:37 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id 6E0BE1E00C; Mon,  3 Jan 2000 14:43:37 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail02-lax.pilot.net (mail-lax-2.pilot.net [205.139.40.16])
	by mail-green.research.att.com (Postfix) with ESMTP id DC7581E00B
	for <reeds@research.att.com>; Mon,  3 Jan 2000 14:43:36 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail02-lax.pilot.net with ESMTP id LAA27145; Mon, 3 Jan 2000 11:43:32 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id LAA02375; Mon, 3 Jan 2000 11:43:31 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id LAA23602 for <voynich@rand.org>; Mon, 3 Jan 2000 11:43:09 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id LAA02354 for <voynich@rand.org>; Mon, 3 Jan 2000 11:43:09 -0800 (PST)
Received: from mailer3.bham.ac.uk (mailer3.bham.ac.uk [147.188.128.54]) by mail01-lax.pilot.net with ESMTP id LAA17375 for <voynich@rand.org>; Mon, 3 Jan 2000 11:43:08 -0800 (PST)
Received: from bham.ac.uk ([147.188.128.127])
	by mailer3.bham.ac.uk with esmtp (Exim 3.02 #16)
	id 125DNi-0003Ji-00
	for voynich@rand.org; Mon, 03 Jan 2000 19:43:06 +0000
Received: from is-fs13.bham.ac.uk ([147.188.127.26])
	by bham.ac.uk with esmtp (Exim 3.10 #1)
	id 125DNi-0000db-04
	for voynich@rand.org; Mon, 03 Jan 2000 19:43:06 +0000
Received: from BHAM-IS-FS13/SpoolDir by is-fs13.bham.ac.uk (Mercury 1.44);
    3 Jan 00 19:43:24 +0000
Received: from SpoolDir by BHAM-IS-FS13 (Mercury 1.44); 3 Jan 00 19:43:20 +0000
Received: from oemcomputer (147.188.137.1) by is-fs13.bham.ac.uk (Mercury 1.44) with ESMTP;
    3 Jan 00 19:43:13 +0000
From: "Gabriel Landini" <G.Landini@bham.ac.uk>
Organization: The University of Birmingham, U.K.
To: voynich@rand.org
Date: Mon, 3 Jan 2000 19:43:06 -0000
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Subject: VMS in the Fortean Times
Reply-To: G.Landini@bham.ac.uk
X-mailer: Pegasus Mail for Win32 (v3.12b)
Message-ID: <10F92AB7DDA@is-fs13.bham.ac.uk>
Sender: jim@mail.rand.org
Status: OR

Hi all,
Happy new year. In the Jan 2000 issue of the Fortean Times (UK) there is
an article by Mike Jay (?) on the VMS called "Maze of madness".
The article has several small pictures: f88r, f78r (whole page, the
only "readable" one), f68v1-f69r and f33v-f34r, and engravings of the
portraits of Bacon, Kircher and Dee.
The review is OK, although it lacks all the new details on the
ownership of the manuscript that Rene has discovered recently.
The article states about 3 times that "many have gone mad in the
process", and I wonder whether that is correct -- the author does not
substantiate the claim anywhere.

It briefly mentions the solution claims by Newbold, Strong, Stojko,
Levitov, and Brumbaugh.

The rest of the magazine is of very dubious content, so I have the
feeling that the "going mad in the search" claims are there to attract
an audience.

Cheers,

Gabriel

From reeds Mon Jan  3 18:25:31 2000
From: reeds@fry.research.att.com (Jim Reeds)
Message-Id: <1000103182531.ZM2626255@fry.research.att.com>
Date: Mon, 3 Jan 2000 18:25:31 -0500
In-Reply-To: Andreas Wilhelm <wilhelm@pixelpark.com>
        "Request For Status: Language vs. Cipher / Facsimile Edition" (Jan  3, 18:47)
References: <3870E0A7.2C579C88@pixelpark.com>
X-Mailer: Z-Mail (4.0.1 13Jan97)
To: Andreas Wilhelm <wilhelm@pixelpark.com>, 
 Voynich Mailing List <voynich@rand.org>
Subject: Re: Request For Status: Language vs. Cipher / Facsimile Edition
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Status: OR

Welcome, Andreas!

Let me give very brief answers to your two questions, even
though they deserve long answers.

1.  Language vs. cipher.  The VMS text shows more patterning and
less randomness than languages or ciphers typically show.
Modern ciphers tend to show much less patterning and more randomness
than languages, and the simplest letter substitution ciphers show
the same amounts as languages do.  So we suppose the VMS is written in
some special kind of language (a mad-man's, or formulaic, or
spelled out with a special orthography) to account for the
discrepancy.
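
[Editorial sketch, not part of Jim's message: one way to illustrate
the remark about the simplest letter substitution ciphers. It
measures "patterning" with character entropy and with the entropy of
a character given the one before it, which is an assumption about
what to measure and not a method stated in the message. The sample
sentence and the random substitution key are invented.]

```python
import math
import random
import string
from collections import Counter


def entropy(symbols):
    """Shannon entropy, in bits per symbol, of a sequence."""
    counts = Counter(symbols)
    n = sum(counts.values())
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def conditional_entropy(text):
    """H(next char | current char) = H(pairs) - H(first char of each pair)."""
    pairs = list(zip(text, text[1:]))
    return entropy(pairs) - entropy(text[:-1])


# Build a random one-to-one letter substitution (spaces are left alone).
rng = random.Random(0)
letters = string.ascii_lowercase
shuffled = list(letters)
rng.shuffle(shuffled)
table = str.maketrans(letters, "".join(shuffled))

plain = "the quick brown fox jumps over the lazy dog and the dog sleeps"
cipher = plain.translate(table)

# A one-to-one relabelling of letters leaves both measures unchanged.
print(entropy(plain), entropy(cipher))
print(conditional_entropy(plain), conditional_entropy(cipher))
```

The sketch covers only the substitution case; it says nothing about
how the VMS text itself scores.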

2. Wish list & photo copies. We are currently negotiating with
Yale for a high-class digital scan. I'd like to not comment on
details & terms just yet, except to remark: (a) if these
negotiations are successful, I think your project will be
facilitated, and (b) I'd be grateful if you'd hold off approaching
Yale for a month or so, until after our negotiations have come to a
conclusion one way or the other.

Jim

-- 
Jim Reeds, AT&T Labs - Research
Shannon Laboratory, Room C229, Building 103
180 Park Avenue, Florham Park, NJ 07932-0971, USA

reeds@research.att.com, phone: +1 973 360 8414, fax: +1 973 360 8178

From jim@mail.rand.org  Tue Jan  4 07:48:10 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id HAA20114
	for <reeds@fry.research.att.com>; Tue, 4 Jan 2000 07:48:10 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id EDFF81E0A9; Tue,  4 Jan 2000 04:34:55 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-green.research.att.com (Postfix) with ESMTP id 705B01E0A8
	for <reeds@research.att.com>; Tue,  4 Jan 2000 04:34:55 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id BAA14074; Tue, 4 Jan 2000 01:34:48 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id BAA06923; Tue, 4 Jan 2000 01:34:47 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id BAA26705 for <voynich@rand.org>; Tue, 4 Jan 2000 01:34:26 -0800 (PST)
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id BAA06910 for <voynich@rand.org>; Tue, 4 Jan 2000 01:34:25 -0800 (PST)
Received: from grande.dcc.unicamp.br (grande.dcc.unicamp.br [143.106.7.8]) by mail02-lax.pilot.net with ESMTP id BAA14948 for <voynich@rand.org>; Tue, 4 Jan 2000 01:34:23 -0800 (PST)
Received: from amazonas.dcc.unicamp.br (amazonas.dcc.unicamp.br [143.106.7.11])
	by grande.dcc.unicamp.br (8.9.3/8.9.3) with ESMTP id HAA20242
	for <voynich@rand.org>; Tue, 4 Jan 2000 07:34:13 -0200 (EDT)
Received: from coruja.dcc.unicamp.br (coruja.dcc.unicamp.br [143.106.24.80])
	by amazonas.dcc.unicamp.br (8.8.5/8.8.5) with ESMTP id HAA17863
	for <voynich@rand.org>; Tue, 4 Jan 2000 07:34:11 -0200 (EDT)
Received: (from stolfi@localhost)
	by coruja.dcc.unicamp.br (8.8.5/8.8.5) id HAA01682;
	Tue, 4 Jan 2000 07:34:11 -0200 (EDT)
Date: Tue, 4 Jan 2000 07:34:11 -0200 (EDT)
Message-Id: <200001040934.HAA01682@coruja.dcc.unicamp.br>
From: Jorge Stolfi <stolfi@dcc.unicamp.br>
To: voynich@rand.org
Subject: Re: Request For Status: Language vs. Cipher
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=iso-8859-1
In-Reply-To: <3870E0A7.2C579C88@pixelpark.com>
References: <3870E0A7.2C579C88@pixelpark.com>
Reply-To: stolfi@dcc.unicamp.br
Sender: jim@mail.rand.org
Status: OR


    > [Andreas Wilhelm:] Who are the "key-players" in this discussion
    > (if I may ask so bluntly)?

I seem to be the most vocal defender of the "natural language" theory,
although not its originator.

    > What are the main arguments for the thesis that MS408 is written in
    > cipher, and what are the main arguments for it being a (maybe
    > artificial) language?
    
My arguments were posted to the list, and some can be found in my web
pages.

In short: I don't believe it is a "common" language in cipher, because
a simple cipher would have been cracked by now, and a complex cipher
should not preserve (or produce) the language-like statistics and
structures that we see in the VMS. Moreover, it is hard to imagine why
the author would have wanted to encrypt a whole 250-page book,
while devoting 2/3 of the space to illustrations.

Also, I don't believe it is purely random text (glossolalia, madman's
drivel, etc.), because it has too much structure and homogeneity, and
I cannot see how those features could be faked (or even perceived) by
a 15th century author.

I cannot believe it is a fraud, either, because the feeling is all
wrong: it would be like counterfeiting a 3-cent coin. A scholarly
hoax, say on Baresch or Kircher, is somewhat more likely; but that too
has its problems.

So, almost by exclusion, my current opinion is that Voynichese is a
rather straightforward encoding (e.g. a phonetic transcription) of
some "exotic" language; and the structures that we see in it are
basically those of the language itself. 

If that assumption is true, the structure of the "words" seems to imply
that they are actually syllables. Moreover, since most labels are
single "words", the language must be monosyllabic; and since there is
a large number of distinct "words", the language is probably tonal --
which would mean an East Asian language, such as Tibetan, Chinese,
Vietnamese etc.

The "exotic language" theory seems quite plausible historically
(indeed Baresch himself apparently believed in it), and seems to
explain many features of the VMS that are hard to explain otherwise.
For instance, why the text does not include any words in "classical"
scripts (Roman, Greek, or Hebrew), why there are no number-like
symbols, why we don't see any grammatical structure, why the plants
and cosmology look so alien, etc.

An invented language is also a possibility, of course. However, it
seems that invented languages are either utterly logical, and hence
utterly unnatural (like Dalgarno's, if I got it right, or
Loglan/Lojban); or quite similar to the natural languages known to the
inventor (like Hildegarde's, Esperanto, Enochian, Klingon, etc.). Now
Voynichese seems too irregular to be a "logical" invented language, and too
bizarre to be calqued on a European or Semitic language.  In other words,
if it is an invented language, the inventor must have modeled
it after some "exotic" language...

    > There are a great many papers on entropy and Zipf's law, but is
    > there also already a summarized (somewhat final) conclusion of
    > the findings?

Those studies generally show that there is nothing terribly wrong
with Voynichese as a natural language, although it doesn't seem
to be a standard one.

Some statistics change (sometimes radically) from section to section,
while others, including the basic "word" structure, are surprisingly
constant.  There is good evidence that the pages and sections were
bound in the wrong order.

The entropy studies are inconclusive, since entropy is a property
of the encoding and not of the underlying language -- and we don't
even know what the letters of the alphabet are.
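
[Editorial sketch, not part of Stolfi's message: a toy illustration
of entropy depending on the encoding. The "units" and the two
transcriptions are invented for the example and have nothing to do
with actual Voynichese or any real transcription alphabet.]

```python
import math
from collections import Counter


def entropy(symbols):
    """Shannon entropy, in bits per symbol, of a sequence."""
    counts = Counter(symbols)
    n = sum(counts.values())
    return -sum(c / n * math.log2(c / n) for c in counts.values())


# One message, as a sequence of abstract units (hypothetical).
units = ["ka", "ti", "ka", "no", "ti", "ka", "no", "no"]

# Transcription A: one glyph per unit.
glyph_for = {"ka": "A", "ti": "B", "no": "C"}
text_a = "".join(glyph_for[u] for u in units)

# Transcription B: every unit spelled out with two letters.
text_b = "".join(units)

h_a = entropy(text_a)
h_b = entropy(text_b)

# Same message, different per-symbol entropy under each transcription.
print(text_a, round(h_a, 3))
print(text_b, round(h_b, 3))
```

In this toy case the two-letter spelling comes out exactly one bit
per symbol higher, because each unit splits into two letters that
occur nowhere else. So a per-character entropy figure cannot be
compared across texts until the choice of "letters" is settled.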

    > Are there people actively working on a "decipherment" of the
    > book or are people mainly involved in the structural and
    > linguistic (semantic) analysis?

I think that both teams have many players. However, most people
working on decipherment seem to be assuming a complex code. Under that
approach, there is not much chance of making "a little progress" --
either you crack the code, or you stay stuck at the beginning. So the
mailing list tends to be dominated by linguistic and historical
speculation, and occasional bean-counting.

    > Is the text of Newbold's "decipherment" and possibly the
    > counter-arguments available online or digitally?

I don't know whether the text is available through the net. (In fact I
don't know whether he deciphered more than a few sentences.)

Some of the counter-arguments have been quoted in the mailing list,
over the last few years.

All the best,

--stolfi

From jim@mail.rand.org  Tue Jan  4 08:49:04 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id IAA27370
	for <reeds@fry.research.att.com>; Tue, 4 Jan 2000 08:49:04 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id 7C0751E012; Tue,  4 Jan 2000 08:49:04 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail02-lax.pilot.net (mail-lax-2.pilot.net [205.139.40.16])
	by mail-green.research.att.com (Postfix) with ESMTP id DE48F1E002
	for <reeds@research.att.com>; Tue,  4 Jan 2000 08:49:03 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail02-lax.pilot.net with ESMTP id FAA11968; Tue, 4 Jan 2000 05:48:59 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id FAA12907; Tue, 4 Jan 2000 05:48:58 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id FAA01751 for <voynich@rand.org>; Tue, 4 Jan 2000 05:48:44 -0800 (PST)
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id FAA12894 for <voynich@rand.org>; Tue, 4 Jan 2000 05:48:44 -0800 (PST)
Received: from mailer3.bham.ac.uk (mailer3.bham.ac.uk [147.188.128.54]) by mail02-lax.pilot.net with ESMTP id FAA11935 for <voynich@rand.org>; Tue, 4 Jan 2000 05:48:43 -0800 (PST)
Received: from bham.ac.uk ([147.188.128.127])
	by mailer3.bham.ac.uk with esmtp (Exim 3.02 #16)
	id 125UKI-0007Cw-00
	for voynich@rand.org; Tue, 04 Jan 2000 13:48:42 +0000
Received: from is-fs13.bham.ac.uk ([147.188.127.26])
	by bham.ac.uk with esmtp (Exim 3.10 #1)
	id 125UKH-000605-00
	for voynich@rand.org; Tue, 04 Jan 2000 13:48:42 +0000
Received: from BHAM-IS-FS13/SpoolDir by is-fs13.bham.ac.uk (Mercury 1.44);
    4 Jan 00 13:49:00 +0000
Received: from SpoolDir by BHAM-IS-FS13 (Mercury 1.44); 4 Jan 00 13:48:35 +0000
Received: from oemcomputer (147.188.135.6) by is-fs13.bham.ac.uk (Mercury 1.44) with ESMTP;
    4 Jan 00 13:48:28 +0000
From: "Gabriel Landini" <G.Landini@bham.ac.uk>
Organization: The University of Birmingham, U.K.
To: voynich@rand.org
Date: Tue, 4 Jan 2000 13:48:27 -0000
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Subject: Codex Serafinianus (sp?)
Reply-To: G.Landini@bham.ac.uk
X-mailer: Pegasus Mail for Win32 (v3.12b)
Message-ID: <121A9702EC4@is-fs13.bham.ac.uk>
Sender: jim@mail.rand.org
Status: OR

Hi, 
Can anybody point me to the comment (long ago) about the 
numeric system in the codex Serafinianus (spell?) being base 
31?
Who cracked this, and how?

Thanks

Gabriel

From jim@mail.rand.org  Tue Jan  4 12:11:47 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id MAA62939
	for <reeds@fry.research.att.com>; Tue, 4 Jan 2000 12:11:46 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id CA1284CE0A; Tue,  4 Jan 2000 12:11:46 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-blue.research.att.com (Postfix) with ESMTP id 548924CE02
	for <reeds@research.att.com>; Tue,  4 Jan 2000 12:11:46 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id JAA08695; Tue, 4 Jan 2000 09:11:42 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id JAA26260; Tue, 4 Jan 2000 09:11:40 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id JAA20414 for <voynich@rand.org>; Tue, 4 Jan 2000 09:11:27 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id JAA26230 for <voynich@rand.org>; Tue, 4 Jan 2000 09:11:26 -0800 (PST)
Received: from mailout03.sul.t-online.de (mailout03.sul.t-online.de [194.25.134.81]) by mail03-lax.pilot.net with ESMTP id JAA11333 for <voynich@rand.org>; Tue, 4 Jan 2000 09:11:25 -0800 (PST)
Received: from fwd07.sul.t-online.de 
	by mailout03.sul.t-online.de with smtp 
	id 125XUS-000693-06; Tue, 4 Jan 2000 18:11:24 +0100
Received: from  (0625764225-0001@[193.159.4.96]) by fwd07.sul.t-online.de
	with smtp id 125XUN-18AloOC; Tue, 4 Jan 2000 18:11:19 +0100
From: Zandbergen@t-online.de (Rene)
To: voynich@rand.org
References: <121A9702EC4@is-fs13.bham.ac.uk>
Subject: Re: Codex Serafinianus (sp?)
X-Mailer: T-Online eMail 2.3
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT
Date: Tue, 4 Jan 2000 18:11:19 +0100
Message-ID: <125XUN-18AloOC@fwd07.sul.t-online.de>
X-Sender: 0625764225-0001@t-dialin.net
Sender: jim@mail.rand.org
Status: ORr

Gabriel wrote:
 
> Can anybody point me to the comment (long ago) about the 
> numeric system in the codex Serafinianus (spell?) being base 
> 31?
> Who cracked this, and how?

In a 1991 post, a member called Ron Hale-Evans wrote: 

> Reading through the digests on rand.org, I notice that someone
> mentioned the Codex Seraphinianus and asked if anyone had done
> any work on it. I have managed to largely decode the numbering
> system at the bottom of the pages. It is not a number-place system
> like Arabic or binary. It works more like Roman numerals. I can
> supply some more information if anyone wishes.

I did not immediately find the earlier reference, nor any follow-up.

Cheers, Rene

From reeds Tue Jan  4 12:21:13 2000
From: reeds@fry.research.att.com (Jim Reeds)
Message-Id: <1000104122113.ZM2772483@fry.research.att.com>
Date: Tue, 4 Jan 2000 12:21:13 -0500
In-Reply-To: Zandbergen@t-online.de (Rene)
        "Re: Codex Serafinianus (sp?)" (Jan  4, 18:11)
References: <121A9702EC4@is-fs13.bham.ac.uk> 
	<125XUN-18AloOC@fwd07.sul.t-online.de>
X-Mailer: Z-Mail (4.0.1 13Jan97)
To: voynich@rand.org
Subject: Re: Codex Serafinianus (sp?)
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
Status: OR

On Jan 4, 18:11, Rene wrote:

> Subject: Re: Codex Serafinianus (sp?)
...
> In a 1991 post, a member called Ron Hale-Evans wrote:
>
> > Reading through the digests on rand.org, I notice that someone
> > mentioned the Codex Seraphinianus and asked if anyone had done
> > any work on it. I have managed to largely decode the numbering
> > system at the bottom of the pages. It is not a number-place system
> > like Arabic or binary. It works more like Roman numerals. I can
> > supply some more information if anyone wishes.
>
> I did not immediately find the earlier reference, nor any follow-up.
>
> Cheers, Rene
>-- End of excerpt from Rene

On 21 Sept 1998, Jim Gillogly wrote:

> I thought it was an almost-normal radix 21 system, wasn't it?
> So far as I know, those numbers are the only decrypted parts.

The next day Jacques Guy wrote:

> ...
> The page-numbering system is definitely base 21, and fairly easy
> to decode. The only thing that puts you off the track is that
> the Codex is in two parts, and the page numbering starts from 1 again
> in the second part.  The alphabet has uppercase and lowercase
> letters, and the correspondences are not obvious at all (no-one
> has worked them out so far). Chapter titles are all uppercase, and
> the repetitiveness of the letters is very Italian. It gives
> the impression that you ought to be able to decipher them. A false
> impression. I just had a quick look and saw a title with a word
> ending with the same letter repeated three times. Not Italian!
> But could be French: "crée" if we disregard the accents :-)
> The first word or the first letter of each paragraph is often in
> boldface. The text contains numbers, which are very different from
> the rest of the text.
> Each chapter is preceded by a table of contents. The page titles
> in the table of contents correspond to the titles on those pages,
> but not quite: they are "abbreviated". Usually to a median word,
> and often accents are left out. It is as if you had in the contents
...







-- 
Jim Reeds, AT&T Labs - Research
Shannon Laboratory, Room C229, Building 103
180 Park Avenue, Florham Park, NJ 07932-0971, USA

reeds@research.att.com, phone: +1 973 360 8414, fax: +1 973 360 8178

From jim@mail.rand.org  Tue Jan  4 12:29:56 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id MAA25511
	for <reeds@fry.research.att.com>; Tue, 4 Jan 2000 12:29:56 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id 611D04CE12; Tue,  4 Jan 2000 12:29:56 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail02-lax.pilot.net (mail-lax-2.pilot.net [205.139.40.16])
	by mail-blue.research.att.com (Postfix) with ESMTP id 4F9644CE0B
	for <reeds@research.att.com>; Tue,  4 Jan 2000 12:29:51 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail02-lax.pilot.net with ESMTP id JAA21173; Tue, 4 Jan 2000 09:29:47 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id JAA28006; Tue, 4 Jan 2000 09:29:45 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id JAA22808 for <voynich@rand.org>; Tue, 4 Jan 2000 09:29:38 -0800 (PST)
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id JAA27981 for <voynich@rand.org>; Tue, 4 Jan 2000 09:29:37 -0800 (PST)
Received: from mailout01.sul.t-online.de (mailout01.sul.t-online.de [194.25.134.80]) by mail02-lax.pilot.net with ESMTP id JAA21082 for <voynich@rand.org>; Tue, 4 Jan 2000 09:29:36 -0800 (PST)
Received: from fwd01.sul.t-online.de 
	by mailout01.sul.t-online.de with smtp 
	id 125Xm3-0000pY-02; Tue, 4 Jan 2000 18:29:35 +0100
Received: from  (0625764225-0001@[62.156.38.25]) by fwd01.sul.t-online.de
	with smtp id 125Xlq-1ULxOiC; Tue, 4 Jan 2000 18:29:22 +0100
From: Zandbergen@t-online.de (Rene)
To: voynich@rand.org
References: <3870E0A7.2C579C88@pixelpark.com>
Subject: Re: Request For Status: Language vs. Cipher / Facsimile Edition
X-Mailer: T-Online eMail 2.3
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT
Date: Tue, 4 Jan 2000 18:29:22 +0100
Message-ID: <125Xlq-1ULxOiC@fwd01.sul.t-online.de>
X-Sender: 0625764225-0001@t-dialin.net
Sender: jim@mail.rand.org
Status: OR

Andreas Wilhelm wrote:

> Hi All,

Hi!

> I am currently trying to wade through the whole of Jim's compiled
> mailing list traffic (which is a great thing with which to start off!),
> however to shorten my efforts it would be great if I could get a
> summarised status of discussion on two issues:
>
> Language vs. Cipher

This issue was discussed most actively in the more distant past.
Nothing beats wading through the archive to get all the arguments.

> Are there people actively working on a "decipherment"
> of the book or are people mainly involved in the structural and
> linguistic (semantic) analysis?

What I think has happened over the last few years is a shift to
more detailed analysis of the structure of the text. This is working
'towards' a solution rather than 'on' one. Thus, the question 
'cipher vs language' is postponed until after the solution is found :-)
Opinions will still differ among the various people, but since there
is little hard evidence (and piles of pointers in all directions)
it is not very useful to have prolonged "yes it is / no it isn't" 
discussions.

> btw: Is the text of Newbold's "decipherment" and possibly the
> counter-arguments available online or digitally?

There is a defender of Newbold on-line. His name is Michel Theroux.
I lost the URL (I intend to include it in my web site).
You will find some of what you're looking for at that site.

Cheers, Rene

From jim@mail.rand.org  Tue Jan  4 16:37:55 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id QAA45219
	for <reeds@fry.research.att.com>; Tue, 4 Jan 2000 16:37:55 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id B14701E019; Tue,  4 Jan 2000 16:37:55 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail02-lax.pilot.net (mail-lax-2.pilot.net [205.139.40.16])
	by mail-green.research.att.com (Postfix) with ESMTP id 37AFF1E016
	for <reeds@research.att.com>; Tue,  4 Jan 2000 16:37:55 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail02-lax.pilot.net with ESMTP id NAA01745; Tue, 4 Jan 2000 13:37:50 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id NAA18666; Tue, 4 Jan 2000 13:37:48 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id NAA27278 for <voynich@rand.org>; Tue, 4 Jan 2000 13:37:10 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id NAA18603 for <voynich@rand.org>; Tue, 4 Jan 2000 13:37:09 -0800 (PST)
Received: from mailout00.sul.t-online.de (mailout00.sul.t-online.de [194.25.134.16]) by mail03-lax.pilot.net with ESMTP id NAA01068 for <voynich@rand.org>; Tue, 4 Jan 2000 13:37:08 -0800 (PST)
Received: from fwd01.sul.t-online.de 
	by mailout00.sul.t-online.de with smtp 
	id 125bdb-0004fy-03; Tue, 4 Jan 2000 22:37:07 +0100
Received: from  (0625764225-0001@[62.156.38.44]) by fwd01.sul.t-online.de
	with smtp id 125bdM-07m2jYC; Tue, 4 Jan 2000 22:36:52 +0100
From: Zandbergen@t-online.de (Rene)
To: voynich@rand.org
References: <3870E0A7.2C579C88@pixelpark.com> <200001040934.HAA01682@coruja.dcc.unicamp.br>
Subject: Re:  Request For Status: Language vs. Cipher
X-Mailer: T-Online eMail 2.3
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT
Date: Tue, 4 Jan 2000 22:36:52 +0100
Message-ID: <125bdM-07m2jYC@fwd01.sul.t-online.de>
X-Sender: 0625764225-0001@t-dialin.net
Sender: jim@mail.rand.org
Status: OR

On Stolfi's post:

> [...] I don't believe it is purely random text (glossolalia, madman's
> drivel, etc.), because it has too much structure and homogeneity, and
> I cannot see how those features could be faked (or even perceived) by
> a 15th century author.

I do not think it's random either. What's also important is that the
structure depends on the context. An apparent list of star names in the
Ms has almost unique words (in the sense that they do appear elsewhere
in the Ms but are almost unique within the list). The list is 300 items
long and I do not think there is any other stretch of 300 words in the Ms
with so little repetition.
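
One way to check a claim like this is to slide a 300-word window over a
transcription and count the distinct words in each window. A minimal
sketch, assuming the text is already available as a list of word tokens;
the function names and the tokenization are illustrative assumptions and
do not come from any existing transcription tool:

```python
from collections import Counter


def distinct_counts(words, size):
    """Yield (start, distinct word count) for each window of `size` words."""
    if size <= 0 or size > len(words):
        return
    # Word counts for the first window.
    counts = Counter(words[:size])
    yield 0, len(counts)
    for start in range(1, len(words) - size + 1):
        # Drop the word that just left the window.
        leaving = words[start - 1]
        counts[leaving] -= 1
        if counts[leaving] == 0:
            del counts[leaving]
        # Add the word that just entered the window.
        counts[words[start + size - 1]] += 1
        yield start, len(counts)


def least_repetitive_window(words, size=300):
    """Return (start, distinct count) of the window with the most distinct
    words, or None if the text is shorter than `size`."""
    return max(distinct_counts(words, size), key=lambda t: t[1], default=None)
```

Comparing the star-name list against the best window found elsewhere
would show whether it really is the least repetitive stretch; the result
depends on the transcription and on how word breaks are chosen.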

> I cannot believe it is a fraud, either, because the feeling is all
> wrong: it would be like counterfeiting a 3-cent coin. A scholarly
> hoax, say on Baresch or Kircher, is somewhat more likely; but that too
> has its problems.

For a scholarly hoax it is *far* too long. 
(But we should always be careful with 'feelings', since our feelings
belong to a different world than those prevalent in the 15th-17th
Century.) If it's a fake, it would have to be for monetary gain,
or for elevated status. In modern terms: like writing a fake operating
system in order to gain millions. (Sorry, Mike)

> So, almost by exclusion, my current opinion is that Voynichese is a
> rather straightforward encoding (e.g. a phonetic transcription) of
> some "exotic" language; and the structures that we see in it are
> basically those of the language itself. 

Let me surprise you by saying that I tend to agree. I would add that 
the structures may also come from the writing system rather than the
language itself. Of course, my definition of exotic differs a little
from Stolfi's. 

> The "exotic language" theory seems quite plausible historically
> (indeed Baresch himself apparently believed in it), and seems to
> explain many features of the VMS that are hard to explain otherwise.
> For instance, why the text does not include any words in "classical"
> scripts (Roman, Greek, or Hebrew), why there are no number-like
> symbols, why we don't see any grammatical structure, why the plants
> and cosmology look so alien, etc..

Apart from the alienness of plants and cosmology, which is open to 
debate, the above are all valid points in favour of a translation
(or transcription or encoding) of an 'exotic' language. Depending
on the extent of the author's world, exotic could also include
Hebrew, Arabic, Syriac or Persian, which would not distinguish numerals
from alphabetical characters, and which should not be expected to
include Latin or Greek words or symbols.
There are more exotic possibilities beyond the above, still not being
quite as exotic as the far East...

> An invented language is also a possibility, of course. However, it
> seems that invented languages are either utterly logical, and hence
> utterly unnatural (like Dalgarno's, if I got it right, or
> Loglan/Lojban); or quite similar to the natural languages known to the
> inventor (like Hildegarde's, Esperanto, Enochian, Klingon, etc.). 

Of these, Hildegarde's is most interesting chronologically. Of course,
she only invented a long list of nouns. In the 2 centuries leading up
to the creation of the VMs, someone just might have taken the
next step...

Let's also not forget Roger Bacon, who claimed to have devised a 
'common language' with which he thought he could teach people other
languages in only a fraction of the time needed by a standard
method....

>     > There are a great many papers on entropy and Zipf's laws, but is
>     > there also already a summarized (somewhat final) conclusion of
>     > the findings?
>
> Those studies generally show that there is nothing terribly wrong
> with Voynichese as a natural language, although it doesn't seem
> to be a standard one.

Let's say that in most statistics, the VMs text scores outside the
interval occupied by 'normal' languages. (I'm thinking of languages
written in an alphabetical script with 24-36 characters).

> Some statistics change (sometimes radically) from section to section,
> while others, including the basic "word" structure, are surprisingly
> constant.  There is good evidence that the pages and sections were
> bound in the wrong order.

What seems certain is that they were not bound in the order in which
they were written. The next step implied above is a very reasonable
interpretation. (i.e. I also think they are bound in the wrong order.
I even think we can restore the order, with a little more effort).
 
>                                                  [...] most people
> working on decipherment seem to be assuming a complex code. Under that
> approach, there is not much chance of doing "a little progress" --
> either you crack the code, or you stay stuck at the beginning. 

Yes. And worse: most people who have proposed a solution in the past
appear to have at one point assumed a theory including author, language,
contents of the MS and everything, and then started translating.
When counter-evidence showed up, this was either discarded or molded 
to fit the theory. (Of course, there may have been those who realised
they were on the wrong path and we just never heard of them).

I think that what some of the list members are doing, namely analysing
the structure of the words, the script, and the phrases, is the
one way in which incremental progress is possible. Little bits of
evidence (or just strange features) have been found over the last
decade and maybe, hopefully, one day someone has the idea that puts
it all together.
   
>     > Is the text of Newbold's "decipherment" and possibly the
>     > counter-arguments available online or digitally?
>
> I don't know whether the text is available through the net. (In fact I
> don't know whether he deciphered more than a few sentences.)

The one thing that may be said about Newbold's solution is that he has
produced a good quantity of grammatical (I think) and totally sensible
plain text. He even produced plain text which included information
that he was supposedly not aware of and which could be verified afterwards.

This brings me to a question I've been wondering about for some time,
and which I will address in a separate post.

Cheers, Rene

From jim@mail.rand.org  Tue Jan  4 19:00:48 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id TAA69483
	for <reeds@fry.research.att.com>; Tue, 4 Jan 2000 19:00:48 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id E4A964CE12; Tue,  4 Jan 2000 19:00:47 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-blue.research.att.com (Postfix) with ESMTP id 469BF4CE0C
	for <reeds@research.att.com>; Tue,  4 Jan 2000 19:00:47 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id QAA05193; Tue, 4 Jan 2000 16:00:43 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id QAA29367; Tue, 4 Jan 2000 16:00:41 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id QAA14956 for <voynich@rand.org>; Tue, 4 Jan 2000 16:00:33 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id QAA29333 for <voynich@rand.org>; Tue, 4 Jan 2000 16:00:29 -0800 (PST)
Received: from scryer.mentat.com ([192.88.122.130]) by mail01-lax.pilot.net with ESMTP id QAA05106 for <voynich@rand.org>; Tue, 4 Jan 2000 16:00:29 -0800 (PST)
Received: from acm.org (localhost [127.0.0.1])
	by scryer.mentat.com (8.8.7/8.8.7) with ESMTP id QAA05930;
	Tue, 4 Jan 2000 16:00:16 -0800
Sender: jim@scryer.mentat.com
Message-ID: <38728990.A947EA5@acm.org>
Date: Wed, 05 Jan 2000 00:00:16 +0000
From: Jim Gillogly <jim@acm.org>
Organization: Banzai Institute
X-Mailer: Mozilla 4.61 [en] (X11; U; Linux 2.2.12 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: voynich@rand.org
Subject: Re: Codex Seraphinianus
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Status: OR

We had a long discussion about the numbering and various other
aspects on the SF-Lovers list in 1987.  Martin Feather started
working it out and ran into some snags, but recognized the
essential radix-21 representation.  The definitive work was
done by Allan C. Wechsler -- as far as I know his is the earliest
complete exposition of the numbering.
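
The positional part of a radix-21 scheme can be sketched as follows. This
is only an illustration of plain base-21 arithmetic: it assumes an
ordinary place-value system, so it does not model whatever makes the
Codex numbering "almost" normal, the restart of page numbers in the
second part, or the glyphs themselves. The function names are
hypothetical.

```python
def to_radix21(n):
    """Return the base-21 digits of a non-negative integer,
    most significant digit first."""
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = []
    while True:
        n, remainder = divmod(n, 21)
        digits.append(remainder)
        if n == 0:
            break
    return digits[::-1]


def from_radix21(digits):
    """Inverse of to_radix21: turn a list of base-21 digits back
    into an integer."""
    value = 0
    for d in digits:
        if not 0 <= d < 21:
            raise ValueError("each digit must be in the range 0..20")
        value = value * 21 + d
    return value
```

For example, page 100 would be written with the two digits 4 and 16,
since 4 * 21 + 16 = 100.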

Wechsler thought the codex language was a fake, based on an
analysis of "upper-case" characters.  Others weren't convinced.

Bill Newlin wrote in 1994:
:The author is alive and well and living in Italy.  The Dutch publisher
:who participated in the international co-edition (Maarten Asscher of
:Meulenhoff) claims he is a man of "high seriousness," and that he insists
:the language is genuine.
-- 
	Jim Gillogly
	Highday, 13 Afteryule S.R. 2000, 23:49
	12.19.6.15.3, 12 Akbal 11 Kankin, Sixth Lord of Night

From jim@mail.rand.org  Tue Jan  4 21:22:53 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id VAA32869
	for <reeds@fry.research.att.com>; Tue, 4 Jan 2000 21:22:53 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id 59B8F1E00D; Tue,  4 Jan 2000 21:22:53 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-green.research.att.com (Postfix) with ESMTP id D44B01E00C
	for <reeds@research.att.com>; Tue,  4 Jan 2000 21:22:52 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id SAA21069; Tue, 4 Jan 2000 18:22:49 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id SAA06578; Tue, 4 Jan 2000 18:22:47 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id SAA27145 for <voynich@rand.org>; Tue, 4 Jan 2000 18:22:26 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id SAA06557 for <voynich@rand.org>; Tue, 4 Jan 2000 18:22:25 -0800 (PST)
Received: from mail.nctimes.net (mail.nctimes.net [208.239.16.66]) by mail03-lax.pilot.net with ESMTP id SAA00398 for <voynich@rand.org>; Tue, 4 Jan 2000 18:22:24 -0800 (PST)
Received: from nctimes.net ([208.239.20.207]) by mail.nctimes.net
          (Netscape Messaging Server 3.5)  with ESMTP id AAA5C71;
          Tue, 4 Jan 2000 18:19:05 -0800
Message-ID: <3872AA8D.8C4A33CA@nctimes.net>
Date: Tue, 04 Jan 2000 18:21:01 -0800
From: Mark Perakh <perakh@nctimes.net>
Reply-To: perakh@nctimes.net
Organization: home
X-Mailer: Mozilla 4.7 [en] (Win98; I)
X-Accept-Language: en,ru
MIME-Version: 1.0
To: Rene <Zandbergen@t-online.de>
Cc: voynich@rand.org
Subject: Re: Request For Status: Language vs. Cipher
References: <3870E0A7.2C579C88@pixelpark.com> <200001040934.HAA01682@coruja.dcc.unicamp.br> <125bdM-07m2jYC@fwd01.sul.t-online.de>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR



In response to the most recent exchange of views, apparently prompted by
Andreas' questions, I take the liberty of reminding the list of the following.
A few months ago I suggested some thoughts about the VMS based on the
application of the Letter Serial Correlation (LSC) test.  The only response I
ever received was from Rene, whose opinion was that the two articles on my
website were relevant to the discussion.  I don't know if anybody besides Rene
has ever read those two articles (Frogguy once said he would do so, but never
said he actually did).  If anybody did, I don't know whether the absence of
reaction was due to the opinion that it was all a crock, that it was nothing
new, or to an unwillingness to delve into the LSC method.  All three
views/approaches are legitimate.

However, I would like to say that the LSC test in itself was proven to
distinguish reasonably well between meaningful texts, gibberish, and
artificially created texts with either low or high entropy.  The application of
LSC to the VMS yielded data exactly like those obtained for meaningful texts in
12 natural languages.  The LSC tests also revealed a considerable difference
between VMS-A and VMS-B, leading to a conclusion (arguable, of course) that
both were written in the same language but with A using many more
abbreviations than B.  The empirical measure of the overall entropy, based
mainly on LSC data, placed both A and B right within the range for those 12
languages.  On the other hand, the letter frequency distribution (assuming
every symbol in the VMS is a letter) was much more non-uniform than in any of
the natural texts explored.  A division into vowels and consonants was also
suggested (again, arguable).

That is a brief summary of those data.  Of course I would rather see rebuttals
than silence, but apparently it is not going to happen.  Anyway, I had fun doing
that study, so it was not completely in vain.  Best to all.  Mark
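
For anyone who wants to check the letter-frequency point on a transcription of
their own, here is a minimal Python sketch.  It is only an illustration, not the
entropy measure used in the LSC articles: it computes the plain first-order
Shannon entropy of the symbol frequencies and compares it with the value a
perfectly uniform alphabet of the same size would give.  The function names and
the sample string are assumptions made up for the example.

```python
import math
from collections import Counter

def letter_entropy(text):
    """First-order Shannon entropy (bits per symbol) of the non-space symbols."""
    symbols = [c for c in text if not c.isspace()]
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def max_entropy(text):
    """Entropy the same alphabet would have if every symbol were equally frequent."""
    alphabet = {c for c in text if not c.isspace()}
    return math.log2(len(alphabet))

# Placeholder text; substitute a real transcription (one character per symbol).
sample = "the quick brown fox jumps over the lazy dog"
print(letter_entropy(sample), max_entropy(sample))
```

The further the first number falls below the second, the more non-uniform the
frequency distribution is.  Both functions assume a non-empty text.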

From jim@mail.rand.org  Wed Jan  5 00:27:42 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id AAA40091
	for <reeds@fry.research.att.com>; Wed, 5 Jan 2000 00:27:41 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id CF1E41E00C; Wed,  5 Jan 2000 00:27:41 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-green.research.att.com (Postfix) with ESMTP id 578C61E009
	for <reeds@research.att.com>; Wed,  5 Jan 2000 00:27:41 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id VAA21259; Tue, 4 Jan 2000 21:27:37 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id VAA11613; Tue, 4 Jan 2000 21:27:36 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id VAA02459 for <voynich@rand.org>; Tue, 4 Jan 2000 21:27:13 -0800 (PST)
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id VAA11588 for <voynich@rand.org>; Tue, 4 Jan 2000 21:27:12 -0800 (PST)
Received: from grande.dcc.unicamp.br (grande.dcc.unicamp.br [143.106.7.8]) by mail02-lax.pilot.net with ESMTP id VAA27327 for <voynich@rand.org>; Tue, 4 Jan 2000 21:27:08 -0800 (PST)
Received: from amazonas.dcc.unicamp.br (amazonas.dcc.unicamp.br [143.106.7.11])
	by grande.dcc.unicamp.br (8.9.3/8.9.3) with ESMTP id DAA04855
	for <voynich@rand.org>; Wed, 5 Jan 2000 03:27:02 -0200 (EDT)
Received: from coruja.dcc.unicamp.br (coruja.dcc.unicamp.br [143.106.24.80])
	by amazonas.dcc.unicamp.br (8.8.5/8.8.5) with ESMTP id DAA14379
	for <voynich@rand.org>; Wed, 5 Jan 2000 03:27:01 -0200 (EDT)
Received: (from stolfi@localhost)
	by coruja.dcc.unicamp.br (8.8.5/8.8.5) id DAA03275;
	Wed, 5 Jan 2000 03:27:00 -0200 (EDT)
Date: Wed, 5 Jan 2000 03:27:00 -0200 (EDT)
Message-Id: <200001050527.DAA03275@coruja.dcc.unicamp.br>
From: Jorge Stolfi <stolfi@dcc.unicamp.br>
To: voynich@rand.org
Subject: Re:  Request For Status: Language vs. Cipher
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=iso-8859-1
In-Reply-To: <125bdM-07m2jYC@fwd01.sul.t-online.de>
References: <3870E0A7.2C579C88@pixelpark.com>
	<200001040934.HAA01682@coruja.dcc.unicamp.br>
	<125bdM-07m2jYC@fwd01.sul.t-online.de>
Reply-To: stolfi@dcc.unicamp.br
Sender: jim@mail.rand.org
Status: OR


    > [Rene:] ... apart from the alienness of plants and cosmology,
    > which is open to debate ...
    
Note that Baresch himself did not recognize the plants. Moreover, his
letter seems to say that he showed the VMS around to people in Germany
(Bohemia?), and they could not identify them either.

I would say that the failure of those people is much more significant
than the failure of modern botanists and paleographers. Baresch and
his friends surely knew many of the classical medical plants, *and* 
the standard pictorial "language" of medieval herbals ---
better than any modern expert.

    > What seems certain is that they were not bound in the order in which
    > they were written. The next step implied above is a very reasonable
    > interpretation. (i.e. I also think they are bound in the wrong order.)

I was thinking of the two-page "heated baths" illustration in the
biological section (f78v+f81r).  (I don't recall who noticed the 
channel connecting the two halves.)  Surely that bifolio was
supposed to be a centerfold.

All the best,

--stolfi

From jim@mail.rand.org  Tue Jan  4 18:42:36 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id SAA50526
	for <reeds@fry.research.att.com>; Tue, 4 Jan 2000 18:42:36 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id 569681E013; Tue,  4 Jan 2000 18:42:36 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-green.research.att.com (Postfix) with ESMTP id AEB3D1E010
	for <reeds@research.att.com>; Tue,  4 Jan 2000 18:42:35 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id PAA28405; Tue, 4 Jan 2000 15:42:31 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id PAA28185; Tue, 4 Jan 2000 15:42:30 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id PAA13084 for <voynich@rand.org>; Tue, 4 Jan 2000 15:42:16 -0800 (PST)
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id PAA28156 for <voynich@rand.org>; Tue, 4 Jan 2000 15:42:15 -0800 (PST)
Received: from mail.alphalink.com.au (mail.alphalink.com.au [203.24.205.7]) by mail02-lax.pilot.net with ESMTP id PAA16393 for <voynich@rand.org>; Tue, 4 Jan 2000 15:42:12 -0800 (PST)
Received: from LOCALNAME (d28-as5-mel.alphalink.com.au [202.161.96.155])
	by mail.alphalink.com.au (8.9.3/8.9.3) with SMTP id KAA29440
	for <voynich@rand.org>; Wed, 5 Jan 2000 10:42:06 +1100
Message-ID: <387303A2.34AD@alphalink.com.au>
Date: Wed, 05 Jan 2000 00:41:06 -0800
From: Jacques Guy <jguy@alphalink.com.au>
Reply-To: jguy@alphalink.com.au
X-Mailer: Mozilla 3.01 (Win16; I)
MIME-Version: 1.0
To: voynich@rand.org
Subject: Re: Codex Serafinianus (sp?)
References: <121A9702EC4@is-fs13.bham.ac.uk> <125XUN-18AloOC@fwd07.sul.t-online.de>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR

phi, not fi.

I do not remember who it was who discovered
that page numbers followed a base-21 system.
I came across the name the other day during
a Web search, but his pages had disappeared
(Error 404).
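
For readers unfamiliar with positional notation in an unusual base, here is a
small Python sketch of what "base-21" means arithmetically.  It says nothing
about which glyphs stand for which digit values in the Codex, which this
message does not give; the function names are made up for the illustration.

```python
def to_base(n, base=21):
    """Digits of a non-negative integer in the given base, most significant first."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return [0]
    digits = []
    while n:
        n, remainder = divmod(n, base)
        digits.append(remainder)
    return digits[::-1]

def from_base(digits, base=21):
    """Inverse of to_base: rebuild the integer from its digit list."""
    value = 0
    for d in digits:
        value = value * base + d
    return value

print(to_base(100))   # 100 = 4*21 + 16, so the digits are [4, 16]
```

So page 21 would be written with two digits (1, 0), and page 441 with three
(1, 0, 0), just as 10 and 100 are in decimal.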

I found this this morning:


http://www.math.bas.bg/~iad/serafin.html

The author, Ivan Derzhanski, is doing with the Codex
what we have been doing with the VMS. There is very
little yet, but fascinating stuff. He does not explain
his transliteration system, though. I ought
to try and figure it out, but I have been lazy.

Frogguy

From jim@mail.rand.org  Wed Jan  5 11:03:00 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id LAA41245
	for <reeds@fry.research.att.com>; Wed, 5 Jan 2000 11:03:00 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id ED6314CE1F; Wed,  5 Jan 2000 11:02:59 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail03-lax.pilot.net (mail-lax-3.pilot.net [205.139.40.17])
	by mail-blue.research.att.com (Postfix) with ESMTP id 597444CE1A
	for <reeds@research.att.com>; Wed,  5 Jan 2000 11:02:59 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail03-lax.pilot.net with ESMTP id IAA28105; Wed, 5 Jan 2000 08:02:56 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id IAA01498; Wed, 5 Jan 2000 08:02:55 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id IAA24787 for <voynich@rand.org>; Wed, 5 Jan 2000 08:02:35 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id IAA01458 for <voynich@rand.org>; Wed, 5 Jan 2000 08:02:35 -0800 (PST)
Received: from mailout02.sul.t-online.de (mailout02.sul.t-online.de [194.25.134.17]) by mail01-lax.pilot.net with ESMTP id IAA13182 for <voynich@rand.org>; Wed, 5 Jan 2000 08:02:34 -0800 (PST)
Received: from fwd02.sul.t-online.de 
	by mailout02.sul.t-online.de with smtp 
	id 125stN-0001uO-01; Wed, 5 Jan 2000 17:02:33 +0100
Received: from  (0625764225-0001@[62.156.38.224]) by fwd02.sul.t-online.de
	with smtp id 125st7-0YOlKCC; Wed, 5 Jan 2000 17:02:17 +0100
From: Zandbergen@t-online.de (Rene)
To: voynich@rand.org
References: <3870E0A7.2C579C88@pixelpark.com>
	 <200001040934.HAA01682@coruja.dcc.unicamp.br>
	 <125bdM-07m2jYC@fwd01.sul.t-online.de> <3872AA8D.8C4A33CA@nctimes.net>
Subject: Re:  Request For Status: Language vs. Cipher
X-Mailer: T-Online eMail 2.3
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT
Date: Wed, 5 Jan 2000 17:02:17 +0100
Message-ID: <125st7-0YOlKCC@fwd02.sul.t-online.de>
X-Sender: 0625764225-0001@t-dialin.net
Sender: jim@mail.rand.org
Status: OR

Stolfi's and Mark's responses make it necessary that I review my
earlier statements. True, while we have always been used to seeing
the statistics of the Voynich MS text being rather different from
normal languages, there are now also a number of categories where
'Voynichese' scores right among these. Mark's results are among
them. More about that a bit later. Also, the graphs on my page:
http://www.voynich.nu/wordent.html show a normal 'vocabulary
build-up'. Maybe there is more, but I'm overlooking it right now.
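
For those who have not seen such a curve, a 'vocabulary build-up' can be
sketched in a few lines of Python.  This is only a generic illustration, under
the assumption that the text has already been split into words; it is not
necessarily how the graphs on the page above were produced, and the function
name is invented for the example.

```python
def vocabulary_growth(words):
    """For each position in the text, the number of distinct words seen so far."""
    seen = set()
    curve = []
    for word in words:
        seen.add(word)
        curve.append(len(seen))
    return curve

# Placeholder word list; a real run would use a tokenised transcription.
print(vocabulary_growth("a b a c b d".split()))
```

Plotting the result against the running word count gives the build-up curve,
which can then be compared between texts of similar length.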

About Mark's LSC papers, I remember reading the one about the
Voynich MS and coming to the conclusion that it looks very interesting
but I had no idea what the numbers were actually representing.
So I went to the other papers and started reading from the beginning.
Following the math required a bit of time, during which I didn't
want to be on-line. I also got stuck in some of the definitions.
I copied the pages to my disk and realised I forgot a few of the
formulas, which are embedded gifs IIRC. I meant to ask Mark a few
detailed questions but never got that far. I also wanted to include
a summary on my web page but for that I first had to understand
the details. (The placeholder is there :-/)

Anyway, to cut an already too long story short, I meant to read on
but never got round to it. I still think there may be important
clues in there, so I recommend all to look at it.

Mark wrote:

> if anybody did [read the articles], was the absence of reaction due
> to the opinion that it all was crock or that it was just nothing new,
> or to the unwillingness to delve into LSC method.. 

Surely not the first two! Perhaps some variety of the last?
 
> However, I would like to say that the LSC test in itself was proven
> to reasonably well distinguish between meaningful texts,  gibberish
> and artificially created texts with either low or high entropy. 

What is important is to understand precisely how these non-meaningful
texts were generated, and how and why the method identifies them
as such. Then the conclusions about the VMs text can be fully 
appreciated. This requires reading several of the articles...

> The application of LSC to VMS resulted in the data
> exactly like those obtained for meaningful texts in 12 natural languages. The
> LSC tests also revealed a considerable difference between VMS-A and VMS-B,
> leading to a conclusion (arguable, of course) that both were written in the
> same language but A using much more abbreviations than B.

I remember being somewhat uneasy about that last conclusion, but I will
hold back until I really understand the method.

>  Of course I would rather see rebuttals than silence, but
> apparently it is not going to happen.

Don't despair just yet. I have no idea how many people started reading
it. The silence probably means lack of understanding rather than 
disagreement. Mistakes are usually pointed out in a correct, friendly
manner in this list.

Cheers, Rene

From jim@mail.rand.org  Wed Jan  5 16:06:40 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id QAA87949
	for <reeds@fry.research.att.com>; Wed, 5 Jan 2000 16:06:40 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id 8B3E41E00C; Wed,  5 Jan 2000 16:06:40 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-green.research.att.com (Postfix) with ESMTP id ED7F71E009
	for <reeds@research.att.com>; Wed,  5 Jan 2000 16:06:39 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id NAA04225; Wed, 5 Jan 2000 13:05:42 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id NAA27734; Wed, 5 Jan 2000 13:05:40 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id NAA06969 for <voynich@rand.org>; Wed, 5 Jan 2000 13:05:02 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id NAA27612 for <voynich@rand.org>; Wed, 5 Jan 2000 13:05:01 -0800 (PST)
Received: from mailout02.sul.t-online.de (mailout02.sul.t-online.de [194.25.134.17]) by mail01-lax.pilot.net with ESMTP id NAA03812 for <voynich@rand.org>; Wed, 5 Jan 2000 13:05:00 -0800 (PST)
Received: from fwd06.sul.t-online.de 
	by mailout02.sul.t-online.de with smtp 
	id 125xc3-0005dN-0G; Wed, 5 Jan 2000 22:04:59 +0100
Received: from  (0625764225-0001@[62.156.12.34]) by fwd06.sul.t-online.de
	with smtp id 125xc0-1iOYEqC; Wed, 5 Jan 2000 22:04:56 +0100
From: Zandbergen@t-online.de (Rene)
To: voynich@rand.org
Subject: Newbold article
X-Mailer: T-Online eMail 2.3
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT
Date: Wed, 5 Jan 2000 22:04:56 +0100
Message-ID: <125xc0-1iOYEqC@fwd06.sul.t-online.de>
X-Sender: 0625764225-0001@t-dialin.net
Sender: jim@mail.rand.org
Status: OR

I found a link to the previously mentioned article by Michael
Theroux (in defense of Newbold) at Mark Perakh's site.
It is at:
http://www.borderlands.com/archives/arch/decipher.htm
Note that this 'advertisement' does not imply support of
the contents :-)

Cheers, Rene

From jim@mail.rand.org  Wed Jan  5 16:15:35 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id QAA89032
	for <reeds@fry.research.att.com>; Wed, 5 Jan 2000 16:15:34 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id CFA054CE1B; Wed,  5 Jan 2000 16:15:34 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-blue.research.att.com (Postfix) with ESMTP id 643B74CE19
	for <reeds@research.att.com>; Wed,  5 Jan 2000 16:15:34 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id NAA08130; Wed, 5 Jan 2000 13:15:14 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id NAA28466; Wed, 5 Jan 2000 13:15:12 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id NAA08143 for <voynich@rand.org>; Wed, 5 Jan 2000 13:15:03 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id NAA28440 for <voynich@rand.org>; Wed, 5 Jan 2000 13:15:03 -0800 (PST)
Received: from mail.nctimes.net (mail.nctimes.net [208.239.16.66]) by mail03-lax.pilot.net with ESMTP id NAA02381 for <voynich@rand.org>; Wed, 5 Jan 2000 13:15:02 -0800 (PST)
Received: from nctimes.net ([208.239.20.90]) by mail.nctimes.net
          (Netscape Messaging Server 3.5)  with ESMTP id AAA2D41;
          Wed, 5 Jan 2000 13:11:13 -0800
Message-ID: <3873B3E4.91545B8F@nctimes.net>
Date: Wed, 05 Jan 2000 13:13:08 -0800
From: Mark Perakh <perakh@nctimes.net>
Reply-To: perakh@nctimes.net
Organization: home
X-Mailer: Mozilla 4.7 [en] (Win98; I)
X-Accept-Language: en,ru
MIME-Version: 1.0
To: jguy@alphalink.com.au
Cc: voynich@rand.org
Subject: Re: Request For Status: Language vs. Cipher
References: <3870E0A7.2C579C88@pixelpark.com>
			 <200001040934.HAA01682@coruja.dcc.unicamp.br>
			 <125bdM-07m2jYC@fwd01.sul.t-online.de> <3872AA8D.8C4A33CA@nctimes.net> <125st7-0YOlKCC@fwd02.sul.t-online.de> <3873FEB4.2F97@alphalink.com.au>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR


Thanks to Rene and Frogguy for the response. In regard to Jacques' comment, the texts
subjected to the LSC test (a total of 69 meaningful texts) included first the Book of
Genesis in 12 languages, then Moby Dick, War and Peace (in English and Hebrew),
collections of short stories in English and Russian, the UN convention on sea trade, the
full text of a Russian newspaper, etc.  Meaningless texts included artificially
created gibberish, artificially created "almost-zero entropy" texts, as well as texts
obtained by permuting the letters, the verses, or the words of meaningful texts.
The direct outcome of the LSC measurement was a set of curves of the so-called LSC sum,
which displayed characteristic minima and other features that differed quite clearly
between meaningful texts and all other types of texts.  Additional information was
extracted by manipulating the LSC sum curves.  The curves for the VMS (obtained separately
for A and B and also for the full version) looked exactly like those for all
meaningful texts, and quite different from those for meaningless conglomerates of
letters. Anybody who feels uncomfortable with the math can omit the first paper on LSC
(or read it diagonally, without delving deeply into the derivation of the formulas) and look at
the experimental material (again, not necessarily chewing every detail) in the second,
third, and fourth articles.  After that the two articles on the VMS should become rather
comprehensible (I hope).
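
To give a feel for the procedure without the formulas, here is a rough Python
sketch.  It rests on an assumption that should be checked against the first
paper: that the LSC sum for chunk size n is obtained by cutting the text into
consecutive chunks of n letters, counting each letter in each chunk, and summing
the squared differences of those counts between adjacent chunks.  The two
permutation helpers mimic the letter-permuted and word-permuted control texts
mentioned above.  All function names are invented for the illustration.

```python
import random
from collections import Counter

def lsc_sum(text, n):
    """Assumed LSC sum: squared count differences between adjacent n-letter chunks."""
    letters = [c for c in text if c.isalpha()]
    k = len(letters) // n                      # number of complete chunks
    chunks = [Counter(letters[i * n:(i + 1) * n]) for i in range(k)]
    total = 0
    for a, b in zip(chunks, chunks[1:]):
        for letter in a.keys() | b.keys():     # a missing letter counts as 0
            total += (a[letter] - b[letter]) ** 2
    return total

def permute_letters(text, seed=0):
    """Control text: the same characters in random order."""
    chars = list(text)
    random.Random(seed).shuffle(chars)
    return "".join(chars)

def permute_words(text, seed=0):
    """Control text: the same words in random order."""
    words = text.split()
    random.Random(seed).shuffle(words)
    return " ".join(words)

def lsc_curve(text, sizes):
    """LSC sum for each chunk size, i.e. the points of one curve."""
    return [(n, lsc_sum(text, n)) for n in sizes]
```

Computing lsc_curve for a text and for its permuted versions over the same
range of chunk sizes gives curves that can be compared side by side.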


Jacques Guy wrote:

>
>
> > Stolfi's and Mark's responses make it necessary that I review my
> > earlier statements. True, while we have always been used to seeing
> > the statistics of the Voynich MS text being rather different from
> > normal languages
>
> Count me out on that. A browse through the archives will reveal
> all my rantings.
>
> > About Mark's LSC papers
>
> Oh... I downloaded them, even complaining to Mark that he
> should have split them into frogguy-size bites, started
> reading, realized  that it would be tough reading (I am
> no mathematician). As it was late, rather, early in the
> morning, I set it aside... and populating that  Web
> site on Easter Island side-tracked me. I had completely
> forgotten that Mark's papers were there, in o:\perakh
> right next to the Easter Island stuff (o:\rr) <-- no,
> that was not an emoticon!
>
>
> > Anyway, to cut an already too long story short, I meant to read on
> > but never got round to it.
>
> Join the club.
>
> > I still think there may be important
> > clues in there, so I recommend all to look at it.
>
> We don't know anything about language. And "we" includes
> me as a linguist. I received a book to review "The Origins
> of Complex Language," OUP. It's all wrong. Even the *data*
> are wrong! We don't know how language works. Imagine
> Newton blind and without the accumulated observations of
> earlier astronomers (Tycho Brahe). You have a mathematician
> tackling language.
>
>
> > What is important is to understand precisely how these non-meaningful
> > texts were generated, and how and why the method identifies them
> > as such.
>
> Remember that many texts are really meaningless. Nursery rhymes
> for instance. And, behind its turgid, convoluted argumentation
> "The Origins of Complex Language" is meaningless. The author
> is hypnotized with his own words and does not realize that they
> make no sense. Consider now meaningful texts which look meaningless:
> a bill of lading, even... yes, I think even cookery recipes
> would look pretty meaningless, and very repetitive if we
> did not know what they were. I have here a cookery book,
> rather, a menu book dating from the last... no! second last
> century (1800's). For every day of the year, a menu. In
> appendix, the recipes for some of the dishes. Think! Let
> it be written in Voynichese (if there is such a thing).
> I'll OCR a few pages and post them here. (It's in English,
> translated from French)
>
> I expect that menus and bills of lading would share many
> statistical properties. Novels, diaries, very different
> from menus and bills of lading. Cookery recipes (or
> alchemical recipes) different again. I posted here quite
> some time ago short excerpts of classical Aztec in translation.
> It is very different from any European literature.
>
> [Mark:]
> > > The application of LSC to VMS resulted in the data
> > > exactly like those obtained for meaningful texts in 12 natural languages.
>
> Hmmm.... the test ought to be applied to "non-meaningful"
> texts. We can't use a monkey-like text generator, because
> humans are very bad at generating random stuff. What is
> a meaningless text? Dennis I think it was directed us to
> a site where there was a corpus of texts from schizophrenics.
> Perhaps that would do. But the litanies of the Holy Virgin
> too, would have "schizophrenic" statistical properties, I
> expect (Ave Maria, stella maris; Ave Maria, gratia plena;
> Ave...)
>
> > > The LSC tests also revealed a considerable difference between VMS-A and VMS-B,
> > > leading to a conclusion (arguable, of course) that both were written in the
> > > same language but A using much more abbreviations than B.
>
> > I remember being somewhat uneasy about that last conclusion, but I will
> > hold back until I really understand the method.
>
> Same here. But I haven't really attempted.
>
> > The silence probably means lack of understanding rather than
> > disagreement.
>
> I haven't even started to try to understand. I really have to
> get off this Easter Island stuff for a while. It is fascinating
> stuff, though, and I can be sure that it is meaningful...
>
> ... well... NO!  I have become persuaded that some of the
> tablets (the London tablet, the Stephen-Chauvet fragment)
> are fakes. So, a part of the corpus is certainly meaningful
> (the lunar calendar), but another is probably meaningless.
>
>
> > Cheers, Rene
>
> You're the optimistic one! I am far from cheerful about all  that :-(
>
> Frogguy

From jim@mail.rand.org  Wed Jan  5 12:34:27 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id MAA30944
	for <reeds@fry.research.att.com>; Wed, 5 Jan 2000 12:34:27 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id 482CF4CE19; Wed,  5 Jan 2000 12:34:27 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail03-lax.pilot.net (mail-lax-3.pilot.net [205.139.40.17])
	by mail-blue.research.att.com (Postfix) with ESMTP id C2B064CE1B
	for <reeds@research.att.com>; Wed,  5 Jan 2000 12:34:26 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail03-lax.pilot.net with ESMTP id JAA02964; Wed, 5 Jan 2000 09:33:45 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id JAA08985; Wed, 5 Jan 2000 09:33:44 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id JAA06244 for <voynich@rand.org>; Wed, 5 Jan 2000 09:33:31 -0800 (PST)
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id JAA08952 for <voynich@rand.org>; Wed, 5 Jan 2000 09:33:30 -0800 (PST)
Received: from mail.alphalink.com.au (mail.alphalink.com.au [203.24.205.7]) by mail02-lax.pilot.net with ESMTP id JAA05874 for <voynich@rand.org>; Wed, 5 Jan 2000 09:33:27 -0800 (PST)
Received: from LOCALNAME (d20-as10-mel.alphalink.com.au [202.161.97.115])
	by mail.alphalink.com.au (8.9.3/8.9.3) with SMTP id EAA03212
	for <voynich@rand.org>; Thu, 6 Jan 2000 04:33:15 +1100
Message-ID: <3873FEB4.2F97@alphalink.com.au>
Date: Wed, 05 Jan 2000 18:32:20 -0800
From: Jacques Guy <jguy@alphalink.com.au>
Reply-To: jguy@alphalink.com.au
X-Mailer: Mozilla 3.01 (Win16; I)
MIME-Version: 1.0
To: voynich@rand.org
Subject: Re: Request For Status: Language vs. Cipher
References: <3870E0A7.2C579C88@pixelpark.com>
		 <200001040934.HAA01682@coruja.dcc.unicamp.br>
		 <125bdM-07m2jYC@fwd01.sul.t-online.de> <3872AA8D.8C4A33CA@nctimes.net> <125st7-0YOlKCC@fwd02.sul.t-online.de>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR

Rene wrote:
 
> Stolfi's and Mark's responses make it necessary that I review my
> earlier statements. True, while we have always been used to seeing
> the statistics of the Voynich MS text being rather different from
> normal languages

Count me out on that. A browse through the archives will reveal
all my rantings.


> About Mark's LSC papers

Oh... I downloaded them, even complaining to Mark that he
should have split them into frogguy-size bites, started
reading, realized  that it would be tough reading (I am
no mathematician). As it was late, rather, early in the 
morning, I set it aside... and populating that  Web
site on Easter Island side-tracked me. I had completely
forgotten that Mark's papers were there, in o:\perakh
right next to the Easter Island stuff (o:\rr) <-- no,
that was not an emoticon!

 
> Anyway, to cut an already too long story short, I meant to read on
> but never got round to it.

Join the club.

> I still think there may be important
> clues in there, so I recommend all to look at it.

We don't know anything about language. And "we" includes
me as a linguist. I received a book to review "The Origins
of Complex Language," OUP. It's all wrong. Even the *data*
are wrong! We don't know how language works. Imagine
Newton blind and without the accumulated observations of
earlier astronomers (Tycho Brahe). You have a mathematician
tackling language.

 
> What is important is to understand precisely how these non-meaningful
> texts were generated, and how and why the method identifies them
> as such. 

Remember that many texts are really meaningless. Nursery rhymes
for instance. And, behind its turgid, convoluted argumentation
"The Origins of Complex Language" is meaningless. The author
is hypnotized with his own words and does not realize that they
make no sense. Consider now meaningful texts which look meaningless:
a bill of lading, even... yes, I think even cookery recipes
would look pretty meaningless, and very repetitive if we
did not know what they were. I have here a cookery book,
rather, a menu book dating from the last... no! second last
century (1800's). For every day of the year, a menu. In 
appendix, the recipes for some of the dishes. Think! Let
it be written in Voynichese (if there is such a thing).
I'll OCR a few pages and post them here. (It's in English,
translated from French) 

I expect that menus and bills of lading would share many
statistical properties. Novels, diaries, very different
from menus and bills of lading. Cookery recipes (or
alchemical recipes) different again. I posted here quite
some time ago short excerpts of classical Aztec in translation.
It is very different from any European literature.

[Mark:] 
> > The application of LSC to VMS resulted in the data
> > exactly like those obtained for meaningful texts in 12 natural languages. 

Hmmm.... the test ought to be applied to "non-meaningful"
texts. We can't use a monkey-like text generator, because
humans are very bad at generating random stuff. What is
a meaningless text? Dennis I think it was directed us to
a site where there was a corpus of texts from schizophrenics.
Perhaps that would do. But the litanies of the Holy Virgin
too, would have "schizophrenic" statistical properties, I
expect (Ave Maria, stella maris; Ave Maria, gratia plena;
Ave...)


> > The LSC tests also revealed a considerable difference between VMS-A and VMS-B,
> > leading to a conclusion (arguable, of course) that both were written in the
> > same language but A using much more abbreviations than B.
 
> I remember being somewhat uneasy about that last conclusion, but I will
> hold back until I really understand the method.

Same here. But I haven't really attempted.

> The silence probably means lack of understanding rather than
> disagreement. 

I haven't even started to try to understand. I really have to
get off this Easter Island stuff for a while. It is fascinating
stuff, though, and I can be sure that it is meaningful...

... well... NO!  I have become persuaded that some of the
tablets (the London tablet, the Stephen-Chauvet fragment)
are fakes. So, a part of the corpus is certainly meaningful
(the lunar calendar), but another is probably meaningless.

 
> Cheers, Rene

You're the optimistic one! I am far from cheerful about all  that :-(

Frogguy

From jim@mail.rand.org  Thu Jan  6 19:58:51 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id TAA35265
	for <reeds@fry.research.att.com>; Thu, 6 Jan 2000 19:58:51 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id F110A4CE14; Thu,  6 Jan 2000 19:58:51 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-blue.research.att.com (Postfix) with ESMTP id 7778E4CE07
	for <reeds@research.att.com>; Thu,  6 Jan 2000 19:58:50 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id QAA20043; Thu, 6 Jan 2000 16:58:45 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id QAA29007; Thu, 6 Jan 2000 16:58:44 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id QAA24463 for <voynich@rand.org>; Thu, 6 Jan 2000 16:56:34 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id QAA28901 for <voynich@rand.org>; Thu, 6 Jan 2000 16:56:33 -0800 (PST)
Received: from mail.nctimes.net (mail.nctimes.net [208.239.16.66]) by mail01-lax.pilot.net with ESMTP id QAA19415 for <voynich@rand.org>; Thu, 6 Jan 2000 16:56:33 -0800 (PST)
Received: from nctimes.net ([208.239.20.2]) by mail.nctimes.net
          (Netscape Messaging Server 3.5)  with ESMTP id AAAD06
          for <voynich@rand.org>; Thu, 6 Jan 2000 16:53:15 -0800
Message-ID: <38753992.F4EF1E0C@nctimes.net>
Date: Thu, 06 Jan 2000 16:55:46 -0800
From: Mark Perakh <perakh@nctimes.net>
Reply-To: perakh@nctimes.net
Organization: home
X-Mailer: Mozilla 4.7 [en] (Win98; I)
X-Accept-Language: en,ru
MIME-Version: 1.0
To: voynich@rand.org
Subject: Landini's request
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR

Hi, Gabriel: in response to your request, I emailed you the URL of my
site containing the LSC stuff using the <reply> button.  I received a
message saying that the University of Birmingham blocks unsolicited
messages and that my message was therefore not delivered.  I am trying
once again, this time using the list, and I hope those list members who
are not interested in that URL will forgive me, since I see no other way
to get through to you.  Please let me know if you received this message.
The URL you asked for is www.bigfoot.com/~perakh/Texts/    Best, Mark

From jim@mail.rand.org  Thu Jan  6 20:47:19 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id UAA93449
	for <reeds@fry.research.att.com>; Thu, 6 Jan 2000 20:47:19 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id C07C01E012; Thu,  6 Jan 2000 20:47:19 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail03-lax.pilot.net (mail-lax-3.pilot.net [205.139.40.17])
	by mail-green.research.att.com (Postfix) with ESMTP id 48E831E004
	for <reeds@research.att.com>; Thu,  6 Jan 2000 20:47:19 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail03-lax.pilot.net with ESMTP id RAA28835; Thu, 6 Jan 2000 17:47:16 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id RAA01500; Thu, 6 Jan 2000 17:47:14 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id RAA28502 for <voynich@rand.org>; Thu, 6 Jan 2000 17:47:05 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id RAA01485 for <voynich@rand.org>; Thu, 6 Jan 2000 17:47:04 -0800 (PST)
Received: from m8.jersey.juno.com (m8.jersey.juno.com [209.67.34.63]) by mail01-lax.pilot.net with ESMTP id RAA01107 for <voynich@rand.org>; Thu, 6 Jan 2000 17:47:03 -0800 (PST)
Received: "G/Y1UiIukdgwzh/sJHRf3iveq/kAgaXs9NWyLteYMv079g7wSQtxlA=="
Received: (from atlan56@juno.com)
 by m8.jersey.juno.com (queuemail) id EVPQJFWH; Thu, 06 Jan 2000 20:46:43 EST
To: voynich@rand.org
Subject: newsletter info requested
Message-ID: <20000106.214441.23575.0.atlan56@juno.com>
X-Mailer: Juno 1.49
X-Juno-Line-Breaks: 0-4
From: nic m stepro <atlan56@juno.com>
Date: Thu, 06 Jan 2000 20:46:43 EST
Sender: jim@mail.rand.org
Status: OR

please send info on your e-mail list


thank you,
nicolette


From jim@mail.rand.org  Fri Jan  7 05:03:32 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id FAA00225
	for <reeds@fry.research.att.com>; Fri, 7 Jan 2000 05:03:32 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id A886C1E00E; Fri,  7 Jan 2000 05:03:32 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail02-lax.pilot.net (mail-lax-2.pilot.net [205.139.40.16])
	by mail-green.research.att.com (Postfix) with ESMTP id EA5921E004
	for <reeds@research.att.com>; Fri,  7 Jan 2000 05:03:31 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail02-lax.pilot.net with ESMTP id CAA04135; Fri, 7 Jan 2000 02:02:29 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id CAA13208; Fri, 7 Jan 2000 02:02:28 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id CAA12500 for <voynich@rand.org>; Fri, 7 Jan 2000 02:01:58 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id CAA13195 for <voynich@rand.org>; Fri, 7 Jan 2000 02:01:58 -0800 (PST)
Received: from mailer3.bham.ac.uk (mailer3.bham.ac.uk [147.188.128.54]) by mail01-lax.pilot.net with ESMTP id CAA10166 for <voynich@rand.org>; Fri, 7 Jan 2000 02:01:57 -0800 (PST)
Received: from bham.ac.uk ([147.188.128.127])
	by mailer3.bham.ac.uk with esmtp (Exim 3.02 #16)
	id 126WDU-0005A2-00
	for voynich@rand.org; Fri, 07 Jan 2000 10:01:56 +0000
Received: from is-fs13.bham.ac.uk ([147.188.128.55])
	by bham.ac.uk with esmtp (Exim 3.10 #1)
	id 126WDT-0006Oy-04
	for voynich@rand.org; Fri, 07 Jan 2000 10:01:55 +0000
Received: from BHAM-IS-FS13/SpoolDir by is-fs13.bham.ac.uk (Mercury 1.44);
    7 Jan 00 10:01:55 +0000
Received: from SpoolDir by BHAM-IS-FS13 (Mercury 1.44); 7 Jan 00 10:01:54 +0000
Received: from golem (147.188.72.20) by is-fs13.bham.ac.uk (Mercury 1.44) with ESMTP;
    7 Jan 00 10:01:45 +0000
From: "Gabriel Landini" <G.Landini@bham.ac.uk>
Organization: The University of Birmingham, UK.
To: voynich@rand.org
Date: Fri, 7 Jan 2000 10:00:39 -0000
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Subject: Re: Landini's request
Reply-To: G.Landini@bham.ac.uk
In-reply-to: <38753992.F4EF1E0C@nctimes.net>
X-mailer: Pegasus Mail for Win32 (v3.12a)
Message-ID: <2E7D085675@is-fs13.bham.ac.uk>
Sender: jim@mail.rand.org
Status: OR

On 6 Jan 00, at 16:55, Mark Perakh wrote:
> Hi, Gabriel: in response to your request, I emailed to you the URL of
> my site containing LSC stuff using the <reply>  button.  I received a
> message saying that univ of Birmingham blocks unsolicited messages and
> therefore my message was not delivered. 

Hello Mark et al,
Please forgive my reply to the list, but in case anybody else had 
problems sending me mail, then this may be a solution.

I enquired here and they told me the University subscribes to a 
system that blocks e-mails coming from sources (ISPs, not users!) 
that in the past have been sending junk mail. 
This of course prevents everybody else from having messages
delivered from those ISPs. I presume that it is a kind of
"punishment" for ISPs that do not bother to keep junk mail under control.

They tell me that it is impossible for the Univ. to change that list, but 
that if you forward the error message to your service provider 
(usually postmaster@"yourISP")  they will be able to sort it out. 

Or, perhaps your ISP uses yet another route which has been 
identified as a source of junk mail.
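
For anyone curious how this kind of ISP-level blocking usually works: many such systems are DNS-based block lists. The receiving mail server reverses the octets of the connecting IP address, appends the block list's zone, and looks that name up in DNS. The message above does not say which system the University uses, so the following is only a minimal sketch under that assumption; the zone dnsbl.example.org and the function name are hypothetical.

```python
import ipaddress

def dnsbl_query_name(ip, zone="dnsbl.example.org"):
    """Build the DNS name a mail server would look up for a sender IP.

    The zone is a placeholder (an assumption), not a real block list.
    """
    # Validate the address and split it into its four octets.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    # Block lists are queried with the octets in reverse order.
    return ".".join(reversed(octets)) + "." + zone

print(dnsbl_query_name("192.0.2.1"))
```

A server would then resolve that name: by convention an address record in 127.0.0.0/8 means the sender is listed, and a "no such name" answer means it is not. Because the lookup is keyed on the sending server's address, every user of a listed ISP is affected alike.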

I hope that this will help to solve the problem. Please let me know 
what happens.

Gabriel

PS: thanks for the URL!






From jim@mail.rand.org  Sun Jan  9 22:37:49 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id WAA92497
	for <reeds@fry.research.att.com>; Sun, 9 Jan 2000 22:37:49 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id EFEF31E018; Sun,  9 Jan 2000 22:37:49 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail03-lax.pilot.net (mail-lax-3.pilot.net [205.139.40.17])
	by mail-green.research.att.com (Postfix) with ESMTP id 6A2FC1E01C
	for <reeds@research.att.com>; Sun,  9 Jan 2000 22:37:48 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail03-lax.pilot.net with ESMTP id TAA28544; Sun, 9 Jan 2000 19:37:44 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id TAA26835; Sun, 9 Jan 2000 19:37:43 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id TAA18243 for <voynich@rand.org>; Sun, 9 Jan 2000 19:37:31 -0800 (PST)
From: RSRICHMOND@aol.com
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id TAA26785 for <voynich@rand.org>; Sun, 9 Jan 2000 19:37:30 -0800 (PST)
Received: from imo12.mx.aol.com (imo12.mx.aol.com [152.163.225.2]) by mail03-lax.pilot.net with ESMTP id SAA18035 for <voynich@rand.org>; Sun, 9 Jan 2000 18:28:28 -0800 (PST)
Received: from RSRICHMOND@aol.com
	by imo12.mx.aol.com (mail_out_v24.6.) id 6.12.12df90e8 (4330)
	 for <voynich@rand.org>; Sun, 9 Jan 2000 21:27:55 -0500 (EST)
Message-ID: <12.12df90e8.25aa9dab@aol.com>
Date: Sun, 9 Jan 2000 21:27:55 EST
Subject: Re:  An invented language (or a hoax?)
To: voynich@rand.org
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Mailer: AOL 3.0.1 for Mac sub 78
Sender: jim@mail.rand.org
Status: OR

Indeed this strange tale Frogguy bears from
http://www.geocities.com/aladgyma/stories/sodom/intro.htm
of a story so repulsive that the writer fears it would cause real harm if 
anyone ever read it, so - in order to publish it! - he claims to have 
translated it into a constructed language whose documentation he then 
destroyed -

it sort of reminds me of the story of the man who went to the limerick 
convention and heard The Grossest Limerick in the World. When he got home, he 
told his wife he had heard The Grossest Limerick in the World, and she 
demanded to hear it. "No," he replied, "it's just too disgusting." His wife 
persisted, and finally he relented and recited it, saying that where it was 
just TOO bad, he'd just recite the scansion. And it went:
taDAda taDAda taDA,
taDAda taDAda taDA.
taDAda taDAda
taDAda taDAda
taDAda taDAda ta f*ck.

Bob Richmond
Samurai Pathologist
Knoxville, Tennessee USA

From jim@mail.rand.org  Sun Jan  9 16:23:44 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id QAA85552
	for <reeds@fry.research.att.com>; Sun, 9 Jan 2000 16:23:44 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id F0EFC4CE06; Sun,  9 Jan 2000 16:23:44 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail03-lax.pilot.net (mail-lax-3.pilot.net [205.139.40.17])
	by mail-blue.research.att.com (Postfix) with ESMTP id 50D804CE03
	for <reeds@research.att.com>; Sun,  9 Jan 2000 16:23:43 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail03-lax.pilot.net with ESMTP id NAA03980; Sun, 9 Jan 2000 13:23:34 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id NAA19981; Sun, 9 Jan 2000 13:23:33 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id NAA11294 for <voynich@rand.org>; Sun, 9 Jan 2000 13:22:41 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id NAA19964 for <voynich@rand.org>; Sun, 9 Jan 2000 13:22:40 -0800 (PST)
Received: from mail.alphalink.com.au (mail.alphalink.com.au [203.24.205.7]) by mail01-lax.pilot.net with ESMTP id NAA27872 for <voynich@rand.org>; Sun, 9 Jan 2000 13:22:38 -0800 (PST)
Received: from LOCALNAME (d19-as7-mel.alphalink.com.au [202.161.96.210])
	by mail.alphalink.com.au (8.9.3/8.9.3) with SMTP id IAA11281
	for <voynich@rand.org>; Mon, 10 Jan 2000 08:22:27 +1100
Message-ID: <38796C54.489C@alphalink.com.au>
Date: Sun, 09 Jan 2000 21:21:24 -0800
From: Jacques Guy <jguy@alphalink.com.au>
Reply-To: jguy@alphalink.com.au
X-Mailer: Mozilla 3.01 (Win16; I)
MIME-Version: 1.0
To: voynich@rand.org
Subject: An invented language (or a hoax?)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR

Bonjour tout le monde, (or: qochedy okeedy in V.)

trying to find out about a certain Simon Whitechapel
who has started posting on sci.lang, I hit pay dirt:

http://www.geocities.com/aladgyma/stories/sodom/intro.htm

It's geocities, so don't forget to disable JavaScript,
so as not to be pestered by the pop-up consoles.

Frogguy.

From jim@mail.rand.org  Mon Jan 10 00:54:53 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id AAA99726
	for <reeds@fry.research.att.com>; Mon, 10 Jan 2000 00:54:53 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id 3C7C81E00E; Mon, 10 Jan 2000 00:54:53 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-green.research.att.com (Postfix) with ESMTP id ACEA61E008
	for <reeds@research.att.com>; Mon, 10 Jan 2000 00:54:52 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id VAA17829; Sun, 9 Jan 2000 21:54:49 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id VAA29279; Sun, 9 Jan 2000 21:54:48 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id VAA20404 for <voynich@rand.org>; Sun, 9 Jan 2000 21:54:33 -0800 (PST)
Received: from mail02-oak.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id VAA29266 for <voynich@rand.org>; Sun, 9 Jan 2000 21:54:32 -0800 (PST)
Received: from mail.alphalink.com.au (mail.alphalink.com.au [203.24.205.7]) by mail02-oak.pilot.net with ESMTP id VAA15325 for <voynich@rand.org>; Sun, 9 Jan 2000 21:54:29 -0800 (PST)
Received: from LOCALNAME (d09-ds2-mel.alphalink.com.au [202.161.101.137])
	by mail.alphalink.com.au (8.9.3/8.9.3) with SMTP id QAA31053
	for <voynich@rand.org>; Mon, 10 Jan 2000 16:53:50 +1100
Message-ID: <3879E42F.5413@alphalink.com.au>
Date: Mon, 10 Jan 2000 05:52:47 -0800
From: Jacques Guy <jguy@alphalink.com.au>
Reply-To: jguy@alphalink.com.au
X-Mailer: Mozilla 3.01 (Win16; I)
MIME-Version: 1.0
To: voynich@rand.org
Subject: Re: An invented language (or a hoax?)
References: <12.12df90e8.25aa9dab@aol.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR

RSRICHMOND@aol.com wrote:
 
> Indeed this strange tale Frogguy bears from
> http://www.geocities.com/aladgyma/stories/sodom/intro.htm
> of a story so repulsive that the writer fears it would cause real harm if
> anyone ever read it, so - in order to publish it! - he claims to have
> translated it into a constructed language whose documentation he then
> destroyed -

You know, I was not having my tongue in my cheek. Luigi Serafini claims
that his Codex is written in a real language. By "real" he only means
that it is not glossolalia, I gather. This fellow, Simon Whitechapel,
who posts on sci.lang as amygdala (and whose e-mail is aladgyma), claims
something similar. Imagine that Jorge or Mark were to waste their time
analyzing Amygdalese, independently, and without telling Whitechapel
(but I doubt that there is quite enough data there). *Then* we ask
Amygdala about the truth of it, in the full knowledge that he might tell
us a lie. The same could be done with the Codex Seraphinianus, except
that the effort of transcribing it would not make it worthwhile.
However, I have a suggestion for Jorge (I think this is right down his
line of research). The writing of the Codex is extremely regular, and
clear. Say I scan it. Jorge, can you figure out an optical
character-recognition algorithm that would turn that into an ascii
transcription? I strongly suspect the problem is not so very difficult.
I had given much thought to it for the Easter Island hieroglyphs, which
are so astonishingly modular, and, in my ignorance, I came to believe it
could be done. I think it would have very general applications, too. If
I wasn't so committed now to dispelling misconceptions about those
hieroglyphs, I'd give it a try. But unlike the VMS, those hieroglyphs
have attracted nothing but kooks and madmen. Those Augean stables need
cleaning. (Call me Herakles)

That was

... yet another unhinged  idea/suggestion/ranting/thinking-aloud
from....

yours qoteedily!

(okedy qocheedy dain!)

From jim@mail.rand.org  Mon Jan 10 16:07:07 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id QAA03117
	for <reeds@fry.research.att.com>; Mon, 10 Jan 2000 16:07:06 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id A882E4CE16; Mon, 10 Jan 2000 16:07:06 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-blue.research.att.com (Postfix) with ESMTP id EA5234CE15
	for <reeds@research.att.com>; Mon, 10 Jan 2000 16:07:05 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id NAA21548; Mon, 10 Jan 2000 13:06:55 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id NAA13423; Mon, 10 Jan 2000 13:06:51 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id NAA27594 for <voynich@rand.org>; Mon, 10 Jan 2000 13:06:03 -0800 (PST)
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id NAA13323 for <voynich@rand.org>; Mon, 10 Jan 2000 13:06:02 -0800 (PST)
Received: from mailout02.sul.t-online.de (mailout02.sul.t-online.de [194.25.134.17]) by mail02-lax.pilot.net with ESMTP id NAA27463 for <voynich@rand.org>; Mon, 10 Jan 2000 13:06:01 -0800 (PST)
Received: from fwd00.sul.t-online.de 
	by mailout02.sul.t-online.de with smtp 
	id 127m0l-0001Mm-06; Mon, 10 Jan 2000 22:05:59 +0100
Received: from  (0625764225-0001@[62.158.1.31]) by fwd00.sul.t-online.de
	with smtp id 127m0Y-0BDymeC; Mon, 10 Jan 2000 22:05:46 +0100
From: Zandbergen@t-online.de (Rene)
To: voynich@rand.org
References: <12.12df90e8.25aa9dab@aol.com> <3879E42F.5413@alphalink.com.au>
Subject: Re:  An invented language (or a hoax?)
X-Mailer: T-Online eMail 2.3
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT
Date: Mon, 10 Jan 2000 22:05:46 +0100
Message-ID: <127m0Y-0BDymeC@fwd00.sul.t-online.de>
X-Sender: 0625764225-0001@t-dialin.net
Sender: jim@mail.rand.org
Status: OR

Frogguy wrote:

> the effort of transcribing [the Codex Seraphinianus] would not make it
> worthwhile. However, I have a suggestion for Jorge (I think this is right
> down  his line of research).
> The writing of the Codex is extremely regular,  and clear. Say I  scan
> it.

It being printed presents a big advantage over our beloved VMs.
Does the script consist of separate characters or is it a cursive
script like the VMs? If the former, can't you just tell most
standard OCR software that a certain character it doesn't recognize
is an 'A', another one is really a 'b' etc., until it recognizes
everything?
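
The "teach it one unknown character at a time" idea can be sketched roughly as follows. This is only an illustration under strong assumptions: the page has already been cut into individual glyphs, each glyph is a flattened black-and-white bitmap of the same size (a string of '0' and '1'), and matching is by simple Hamming distance. The names transcribe, ask_label and max_distance are hypothetical and not taken from any real OCR package.

```python
def hamming(a, b):
    """Number of pixel positions at which two equal-sized bitmaps differ."""
    return sum(x != y for x, y in zip(a, b))

def transcribe(glyphs, ask_label, max_distance=2):
    """Label each glyph, asking a human only when nothing known is close.

    glyphs       -- iterable of flattened bitmaps (strings of '0'/'1')
    ask_label    -- callback that returns a label for an unknown bitmap
    max_distance -- largest pixel difference still counted as a match
    """
    templates = []  # (bitmap, label) pairs learned so far
    out = []
    for g in glyphs:
        # Find the closest template seen so far, if there is one.
        best = min(templates, key=lambda t: hamming(t[0], g), default=None)
        if best is None or hamming(best[0], g) > max_distance:
            label = ask_label(g)          # new shape: the user names it
            templates.append((g, label))
        else:
            label = best[1]               # close enough to a known shape
        out.append(label)
    return "".join(out)
```

For a printed script with well separated characters the user would be asked only once per distinct shape; a cursive script breaks the assumption that glyphs can be cut apart beforehand.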

> Jorge, can you figure out an optical character-recognition algorithm
> that would turn that  into an ascii transcription?  I strongly suspect
> the problem  is not so very difficult. I had given much thought to
> it for  the Easter Island hieroglyphs, which are so astonishingly
> modular, and, in my ignorance, I came to believe it could be done. I
> think it would have very general applications, too. If I wasn't so
> committed now to dispelling misconceptions about those hieroglyphs,
> I'd give  it a try.

Sounds like a good student project. To put a student on something
useless like a Voynichese OCR would be a bit iffy, but if it is 
something that could be used for Rongorongo and is perhaps general
enough to be used _also_ for other undeciphered scripts, now that
is something one would not have to be ashamed of proposing to a 
student. 
(By itself, the Rongorongo corpus is probably hardly large enough
to warrant programming an OCR...)

> But unlike the VMS, those hieroglyphs have attracted nothing but kooks
> and madmen.

Surely you jest :-)

Cheers, Rene


From jim@mail.rand.org  Tue Jan 11 16:19:26 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id QAA10649
	for <reeds@fry.research.att.com>; Tue, 11 Jan 2000 16:19:26 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id 60DDA1E019; Tue, 11 Jan 2000 16:19:26 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-green.research.att.com (Postfix) with ESMTP id BF2481E022
	for <reeds@research.att.com>; Tue, 11 Jan 2000 16:19:25 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id NAA19012; Tue, 11 Jan 2000 13:19:20 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id NAA05031; Tue, 11 Jan 2000 13:19:17 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id NAA18431 for <voynich@rand.org>; Tue, 11 Jan 2000 13:18:18 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id NAA04919 for <voynich@rand.org>; Tue, 11 Jan 2000 13:18:17 -0800 (PST)
Received: from mailout00.sul.t-online.de (mailout00.sul.t-online.de [194.25.134.16]) by mail01-lax.pilot.net with ESMTP id NAA18497 for <voynich@rand.org>; Tue, 11 Jan 2000 13:18:16 -0800 (PST)
Received: from fwd01.sul.t-online.de 
	by mailout00.sul.t-online.de with smtp 
	id 1288gB-00049D-00; Tue, 11 Jan 2000 22:18:15 +0100
Received: from  (0625764225-0001@[62.158.124.21]) by fwd01.sul.t-online.de
	with smtp id 1288g3-1ct1UmC; Tue, 11 Jan 2000 22:18:07 +0100
From: Zandbergen@t-online.de (Rene)
To: voynich@rand.org
Subject: Fortean times article
X-Mailer: T-Online eMail 2.3
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT
Date: Tue, 11 Jan 2000 22:18:07 +0100
Message-ID: <1288g3-1ct1UmC@fwd01.sul.t-online.de>
X-Sender: 0625764225-0001@t-dialin.net
Sender: jim@mail.rand.org
Status: OR

Gabriel recently announced a new article about the VMs in the Fortean Times.
It provides an interesting reshuffling of the usual information - only, the
title is 'Maze of Madness' and the subtitle 'plenty have gone mad in the
search'. Gabriel already complained about this unfortunate emphasis.

(Of course, it's possible that we're all mad but just don't realize it.)

Anyway...

It's got some nice illustrations and some very doubtful ones. There is
a portrait of Roger Bacon which looks alarmingly familiar (*), and also
a supposed portrait of John Dee, who turns out to look exactly like
the ancient Greek mathematician/astronomer/geographer Ptolemy (**) (depicted
as geographer). Aptly, this portrait is overlapping one of the most 
obviously astronomical diagrams, the one with the 12 sectors and the
7 'planet names'.

*) I can't remember where I saw this portrait before. He looks just like one
   of the early American presidents
**) Obviously, this portrait of Ptolemy is not contemporary but, unless I
   am much mistaken, dates from the Renaissance.

Prompted by that, I hereby submit, without evidence, that the 20th quire
of the Voynich MS, which contains the so-called 'stars' or 'recipes' section,
might well be a summary of Ptolemy's 'geography' (e.g. like the 'list of
cities' found in some later summaries of his work). If anything, this fits
better with the remainder of the MS. 'Medical' recipes are already given
in the pharma section. 'Alchemical' recipes belong in an alchemical MS
which the VMs almost certainly isn't. Geography as a science belongs with
astronomy and astrology, which are well represented. Also missing is
'meteorology', which could be represented by some of the pages labelled
as 'cosmological'.


Cheers, Rene

From jim@mail.rand.org  Wed Jan 12 05:25:09 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id FAA93464
	for <reeds@fry.research.att.com>; Wed, 12 Jan 2000 05:25:09 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id BB2FC1E009; Wed, 12 Jan 2000 05:25:09 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail03-lax.pilot.net (mail-lax-3.pilot.net [205.139.40.17])
	by mail-green.research.att.com (Postfix) with ESMTP id 3B82E1E008
	for <reeds@research.att.com>; Wed, 12 Jan 2000 05:25:09 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail03-lax.pilot.net with ESMTP id CAA07253; Wed, 12 Jan 2000 02:25:05 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id CAA15639; Wed, 12 Jan 2000 02:25:04 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id CAA07944 for <voynich@rand.org>; Wed, 12 Jan 2000 02:24:16 -0800 (PST)
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id CAA15589 for <voynich@rand.org>; Wed, 12 Jan 2000 02:24:15 -0800 (PST)
Received: from red.omnisig.com (root@red.omnisig.com [207.107.11.8]) by mail02-lax.pilot.net with ESMTP id BAA16952 for <voynich@rand.org>; Wed, 12 Jan 2000 01:20:36 -0800 (PST)
Received: from omnisig.com ([172.16.20.29])
	by red.omnisig.com (8.9.3/8.9.3) with ESMTP id EAA06861
	for <voynich@rand.org>; Wed, 12 Jan 2000 04:20:21 -0500
Message-ID: <387BA0E4.FA8FE17E@omnisig.com>
Date: Tue, 11 Jan 2000 16:30:12 -0500
From: John Grove <jgrove@omnisig.com>
X-Mailer: Mozilla 4.61 [en] (WinNT; I)
X-Accept-Language: en
MIME-Version: 1.0
To: voynich@rand.org
Subject: Arabic alchemical documents
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR


	In case anyone was interested in Arabic manuscripts on the subject of
medicine, herbals, and alchemy... I found this site.

http://www.nlm.nih.gov/exhibition/islamic_medical/islamic_11.html

	John Grove.

(Anyone game for transliterating and computing the entropy of the text?)
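
A minimal sketch of what computing the entropy would involve, assuming the text has already been transliterated into a plain string of symbols. It gives the first-order entropy (single characters) and the second-order conditional entropy (a character given the one before it). The function names are illustrative only and do not come from any tool mentioned on the list.

```python
import math
from collections import Counter

def char_entropy(text):
    """First-order entropy, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum(n / total * math.log2(n / total) for n in counts.values())

def conditional_entropy(text):
    """Second-order entropy H(next char | previous char), in bits."""
    if len(text) < 2:
        return 0.0
    pairs = Counter(zip(text, text[1:]))   # counts of adjacent pairs
    firsts = Counter(text[:-1])            # counts of each pair's first char
    total = len(text) - 1
    h = 0.0
    for (a, _b), n in pairs.items():
        # p(a, b) * log2 p(b | a), summed over all observed pairs
        h -= n / total * math.log2(n / firsts[a])
    return h
```

Both figures depend heavily on the transliteration alphabet chosen, so values are only comparable between texts transcribed under the same conventions.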

From jim@mail.rand.org  Wed Jan 12 16:33:22 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id QAA10238
	for <reeds@fry.research.att.com>; Wed, 12 Jan 2000 16:33:21 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id 3AED71E02F; Wed, 12 Jan 2000 16:33:12 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-green.research.att.com (Postfix) with ESMTP id 9F31D1E021
	for <reeds@research.att.com>; Wed, 12 Jan 2000 16:33:11 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id NAA16954; Wed, 12 Jan 2000 13:32:50 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id NAA28841; Wed, 12 Jan 2000 13:32:48 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id NAA08147 for <voynich@rand.org>; Wed, 12 Jan 2000 13:32:25 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id NAA28806 for <voynich@rand.org>; Wed, 12 Jan 2000 13:32:24 -0800 (PST)
Received: from grande.dcc.unicamp.br (grande.dcc.unicamp.br [143.106.7.8]) by mail03-lax.pilot.net with ESMTP id NAA03986 for <voynich@rand.org>; Wed, 12 Jan 2000 13:32:22 -0800 (PST)
Received: from amazonas.dcc.unicamp.br (amazonas.dcc.unicamp.br [143.106.7.11])
	by grande.dcc.unicamp.br (8.9.3/8.9.3) with ESMTP id TAA04706;
	Wed, 12 Jan 2000 19:31:31 -0200 (EDT)
Received: from coruja.dcc.unicamp.br (coruja.dcc.unicamp.br [143.106.24.80])
	by amazonas.dcc.unicamp.br (8.8.5/8.8.5) with ESMTP id TAA08641;
	Wed, 12 Jan 2000 19:31:29 -0200 (EDT)
Received: (from stolfi@localhost)
	by coruja.dcc.unicamp.br (8.8.5/8.8.5) id TAA01385;
	Wed, 12 Jan 2000 19:31:26 -0200 (EDT)
Date: Wed, 12 Jan 2000 19:31:26 -0200 (EDT)
Message-Id: <200001122131.TAA01385@coruja.dcc.unicamp.br>
From: Jorge Stolfi <stolfi@dcc.unicamp.br>
To: jguy@alphalink.com.au
Cc: voynich@rand.org
Subject: Re: An invented language (or a hoax?)
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=iso-8859-1
In-Reply-To: <3879E42F.5413@alphalink.com.au>
References: <12.12df90e8.25aa9dab@aol.com>
	<3879E42F.5413@alphalink.com.au>
Reply-To: stolfi@dcc.unicamp.br
Sender: jim@mail.rand.org
Status: OR


    > However, I have a suggestion for Jorge (I think this is right
    > down his line of research). The writing of the Codex is
    > extremely regular, and clear. Say I scan it. Jorge, can you
    > figure out an optical character-recognition algorithm that would
    > turn that into an ascii transcription? I strongly suspect the
    > problem is not so very difficult.
    
Unfortunately I have no experience with optical character-recognition.
But our VMS fellow Andras Kornai (hello, are you still there?) has
published a paper or two on the subject, which I got from his
homepage; and his methods seem quite appropriate to semi-cursive
scripts like "Serafinese".
    
    > I had given much thought to it for the Easter Island
    > hieroglyphs, which are so astonishingly modular, and, in my
    > ignorance, I came to believe it could be done. I think it would
    > have very general applications, too.
    
I imagine that Mayanists could use that technology too,
although they don't seem to be at that stage yet:

  Maya Hieroglyphic Database Project
  http://outreach.ucdavis.edu/programs/maya2.htm
  
  Mayan Epigraphic Database (MED) Project
  http://jefferson.village.virginia.edu/med/glyph_catalog.html

By the way, Mayanists don't know the whole alphabet yet, so they have
the same transcription problems we have (Are these two glyphs the
same? Is this one glyph or two?) Only that theirs is much worse...

Plus they have the problem of figuring out the reading order, which we
fortunately don't have.

All the best,

--stolfi

From jim@mail.rand.org  Wed Jan 12 17:08:52 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id RAA18658
	for <reeds@fry.research.att.com>; Wed, 12 Jan 2000 17:08:51 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id CD2EB4CE33; Wed, 12 Jan 2000 17:08:51 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-blue.research.att.com (Postfix) with ESMTP id 5BB5B4CE2F
	for <reeds@research.att.com>; Wed, 12 Jan 2000 17:08:51 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id OAA02066; Wed, 12 Jan 2000 14:08:45 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id OAA01825; Wed, 12 Jan 2000 14:08:42 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id OAA13233 for <voynich@rand.org>; Wed, 12 Jan 2000 14:08:30 -0800 (PST)
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id OAA01754 for <voynich@rand.org>; Wed, 12 Jan 2000 14:08:22 -0800 (PST)
Received: from mail.nctimes.net (mail.nctimes.net [208.239.16.66]) by mail02-lax.pilot.net with ESMTP id OAA04489 for <voynich@rand.org>; Wed, 12 Jan 2000 14:08:22 -0800 (PST)
Received: from nctimes.net ([208.239.20.159]) by mail.nctimes.net
          (Netscape Messaging Server 3.5)  with ESMTP id AAA6CFD;
          Wed, 12 Jan 2000 14:04:54 -0800
Message-ID: <387CFB24.3180E7E@nctimes.net>
Date: Wed, 12 Jan 2000 14:07:32 -0800
From: Mark Perakh <perakh@nctimes.net>
Reply-To: perakh@nctimes.net
Organization: home
X-Mailer: Mozilla 4.7 [en] (Win98; I)
X-Accept-Language: en,ru
MIME-Version: 1.0
To: jguy@alphalink.com.au
Cc: Rene <Zandbergen@t-online.de>, voynich@rand.org
Subject: Re: Monkey texts
References: <128SHg-1p1NJ2C@fwd01.sul.t-online.de> <387D65A6.472B@alphalink.com.au>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR

The story of Jacques' monkey texts took place before I joined the list,
so I am confused. Normally the term "monkey text" would be applied to
gibberish in which characters are placed randomly.  In such a case, the entropy
(of all orders) would be well above that of any meaningful text.  Now I
read that Jacques' program creates a monkey text preserving the entropy of a
meaningful text.  Would you kindly explain what it is all about? Cheers to
all, Mark
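
[Editor's sketch, not part of Mark's message.] One way to check the entropy
claim is to estimate, from observed counts, the entropy of each character
given the characters that precede it. The Python below is a minimal sketch
under that assumption; the name `conditional_entropy` is hypothetical, and
nothing here is taken from any program used on the list.

```python
import math
from collections import Counter, defaultdict

def conditional_entropy(text, order=1):
    """Estimate entropy in bits per character, given the previous order-1 characters.

    order=1 is the plain single-character entropy; order=2 conditions on
    one preceding character, and so on.
    """
    if order < 1:
        raise ValueError("order must be at least 1")
    k = order - 1                      # length of the conditioning context
    total = len(text) - k              # number of (context, next char) pairs
    if total <= 0:
        return 0.0
    followers = defaultdict(Counter)   # context -> counts of the next character
    for i in range(total):
        followers[text[i:i + k]][text[i + k]] += 1
    h = 0.0
    for counts in followers.values():
        n_ctx = sum(counts.values())
        for c in counts.values():
            # weight by how often the pair occurs, score by P(next | context)
            h -= (c / total) * math.log2(c / n_ctx)
    return h
```

For a strictly alternating text such as "abababab" this gives 1 bit at order 1
and 0 bits at order 2, since each character is fully determined by the one
before it. Estimates at higher orders need long texts to be trustworthy.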

Jacques Guy wrote:

> Rene wrote:
>
> > I have a copy of Jacques' good old monkey program, but as far as I
> > understand, while it can generate arbitrary text with the same entropy
> > as a given source text, this can only be shown on the screen, not
> > saved in a file. (Correct?)
>
> Non, m'sieur, pas correct. Just press the right arrow key,
> and the text you see on the screen gets sent to a file
> MONKEY.SEZ (the text so saved is highlighted in reverse video
> on the screen). Press the right arrow key again and it
> stops being saved to disk. Again and it is saved to disk,
> again and... am I describing a *toggle*???
>
> Monkey's gibbering is *appended* to the existing MONKEY.SEZ
> (of course, if there isn't one, Monkey creates one).
>
> I really ought to rewrite Monkey from scratch, in Euphoria
> (terrific language, Euphoria, no 64K limit, automatic
> garbage collection, no dangling pointers...)
>
> Frogguy

From jim@mail.rand.org  Wed Jan 12 19:54:17 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id TAA20008
	for <reeds@fry.research.att.com>; Wed, 12 Jan 2000 19:54:17 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id 7C8F01E007; Wed, 12 Jan 2000 19:54:17 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-green.research.att.com (Postfix) with ESMTP id 0F24F1E003
	for <reeds@research.att.com>; Wed, 12 Jan 2000 19:54:17 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id QAA08976; Wed, 12 Jan 2000 16:54:13 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id QAA13214; Wed, 12 Jan 2000 16:54:12 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id QAA00999 for <voynich@rand.org>; Wed, 12 Jan 2000 16:53:53 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id QAA13166 for <voynich@rand.org>; Wed, 12 Jan 2000 16:53:52 -0800 (PST)
Received: from mailout00.sul.t-online.de (mailout00.sul.t-online.de [194.25.134.16]) by mail01-lax.pilot.net with ESMTP id PAA01988 for <voynich@rand.org>; Wed, 12 Jan 2000 15:17:16 -0800 (PST)
Received: from fwd03.sul.t-online.de 
	by mailout00.sul.t-online.de with smtp 
	id 128X0t-0007jw-05; Thu, 13 Jan 2000 00:17:15 +0100
Received: from  (0625764225-0001@[62.156.38.109]) by fwd03.sul.t-online.de
	with smtp id 128X0r-1BKYdMC; Thu, 13 Jan 2000 00:17:13 +0100
From: Zandbergen@t-online.de (Rene)
To: voynich@rand.org
References: <128SHg-1p1NJ2C@fwd01.sul.t-online.de> <387D65A6.472B@alphalink.com.au>
Subject: Re:  Monkey texts
X-Mailer: T-Online eMail 2.3
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT
Date: Thu, 13 Jan 2000 00:17:13 +0100
Message-ID: <128X0r-1BKYdMC@fwd03.sul.t-online.de>
X-Sender: 0625764225-0001@t-dialin.net
Sender: jim@mail.rand.org
Status: OR

Frogguy responded:

> Non, m'sieur, pas correct. Just press the right arrow key,
> and the text you see on the screen gets sent to a file
> MONKEY.SEZ (the text so saved is highlighted in reverse video
> on the screen). ...

Oops. Can you say: RTFM?
Thanks!

> I really ought to rewrite Monkey from scratch, in Euphoria
> (terrific language, Euphoria, no 64K limit, automatic
> garbage collection, no dangling pointers...)

I can't resist. Try fortran. No limit, no garbage creation in
the first place, and no pointers either, dangling or otherwise.

Mark: what I am after is a text typed by a somewhat 'intelligent'
monkey. This will not just generate text based on a single-character
frequency table; instead, the probability of each character depends
on the previous 1, 2 or more characters. These would be 2nd, 3rd or
higher order monkeys respectively. A 3rd order monkey process in
particular generates garbage in which the underlying language is very
clearly recognisable, even though the resulting text is nonsensical.
(There are nice examples in a 1970s book by W. R. Bennett, who also
looked at the Voynich MS - see the bibliography at Jim Reeds' web site).
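
[Editor's sketch, not part of Rene's message.] The order-n monkey described
above can be illustrated in a few lines of Python. This follows the
convention in the message, where an order-n monkey picks each character
according to the previous n-1 characters. It is not Jacques' Monkey program;
the names `build_table` and `monkey_text`, and the fallback used at dead
ends, are assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict

def build_table(source, order):
    """Count which characters follow each (order-1)-character context."""
    k = order - 1
    table = defaultdict(Counter)
    for i in range(len(source) - k):
        table[source[i:i + k]][source[i + k]] += 1
    return table

def monkey_text(source, order, length, seed=None):
    """Generate `length` characters imitating `source` at the given order."""
    if order < 1 or not source:
        raise ValueError("need order >= 1 and a non-empty source")
    rng = random.Random(seed)
    k = order - 1
    table = build_table(source, order)
    out = list(source[:k])             # start from the source's opening context
    while len(out) < length:
        context = "".join(out[len(out) - k:])
        followers = table.get(context)
        if followers:
            chars = list(followers)
            weights = list(followers.values())
            out.append(rng.choices(chars, weights=weights)[0])
        else:
            # Dead end (context seen only at the very end of the source):
            # fall back to a character drawn by plain frequency.
            out.append(rng.choice(source))
    return "".join(out[:length])
```

A call such as `monkey_text(source, 3, 20000, seed=1)` would give a 3rd order
monkey text of the length used in the tests discussed later in this thread,
assuming `source` holds a long enough sample of the language.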

I expect that such monkey texts will exhibit LSC sums with the
same profile as meaningful texts (but it will be interesting to see
which order it takes to do so). If not, well, then the LSC sums really
seem to quantify a much more 'intangible' form of meaning...

Note: I am taking for granted that the Voynich MS was *not* created as
a hoax by someone employing a similar technique as in Jacques' monkey
program. At least not before the current century.

Cheers, Rene

From jim@mail.rand.org  Wed Jan 12 21:11:53 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id VAA36782
	for <reeds@fry.research.att.com>; Wed, 12 Jan 2000 21:11:53 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id DA8251E032; Wed, 12 Jan 2000 21:11:52 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail02-lax.pilot.net (mail-lax-2.pilot.net [205.139.40.16])
	by mail-green.research.att.com (Postfix) with ESMTP id 4284C1E007
	for <reeds@research.att.com>; Wed, 12 Jan 2000 21:11:52 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail02-lax.pilot.net with ESMTP id SAA24844; Wed, 12 Jan 2000 18:11:48 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id SAA18487; Wed, 12 Jan 2000 18:11:47 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id SAA07448 for <voynich@rand.org>; Wed, 12 Jan 2000 18:11:37 -0800 (PST)
From: mskala@ansuz.sooke.bc.ca
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id SAA18438 for <voynich@rand.org>; Wed, 12 Jan 2000 18:11:36 -0800 (PST)
Received: from ansuz.sooke.bc.ca (bbs.bbc.org [139.142.115.249]) by mail03-lax.pilot.net with ESMTP id QAA06482 for <voynich@rand.org>; Wed, 12 Jan 2000 16:13:16 -0800 (PST)
Received: from localhost (mskala@localhost)
	by ansuz.sooke.bc.ca (8.9.3/8.8.7) with ESMTP id PAA16649;
	Wed, 12 Jan 2000 15:40:31 -0800
Date: Wed, 12 Jan 2000 15:40:30 -0800 (PST)
To: Jorge Stolfi <stolfi@dcc.unicamp.br>
Cc: jguy@alphalink.com.au, voynich@rand.org
Subject: Re: An invented language (or a hoax?)
In-Reply-To: <200001122131.TAA01385@coruja.dcc.unicamp.br>
Message-ID: <Pine.LNX.4.10.10001121538060.16645-100000@ansuz.sooke.bc.ca>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: jim@mail.rand.org
Status: OR

On Wed, 12 Jan 2000, Jorge Stolfi wrote:
> Plus they have the problem of figuring out the reading order, which we
> fortunately don't have.

Are you sure?

It seems likely that someone would write in the order intended to be read,
and we can deduce the writing direction from the known limitations of pens
and ink... but a transposition cipher is plausible, and could still be
written flowingly if the writer made a draft first and copied it.

Matthew Skala                       "Ha!" said God, "I've got Jon Postel!"
mskala@ansuz.sooke.bc.ca            "Yes," said the Devil, "but *I've* got
http://www.islandnet.com/~mskala/    all the sysadmins!"

From jim@mail.rand.org  Wed Jan 12 21:20:02 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id VAA10185
	for <reeds@fry.research.att.com>; Wed, 12 Jan 2000 21:20:02 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id 535421E032; Wed, 12 Jan 2000 21:20:02 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail03-lax.pilot.net (mail-lax-3.pilot.net [205.139.40.17])
	by mail-green.research.att.com (Postfix) with ESMTP id BBAA71E007
	for <reeds@research.att.com>; Wed, 12 Jan 2000 21:20:01 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail03-lax.pilot.net with ESMTP id SAA13839; Wed, 12 Jan 2000 18:19:58 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id SAA19323; Wed, 12 Jan 2000 18:19:57 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id SAA08330 for <voynich@rand.org>; Wed, 12 Jan 2000 18:19:51 -0800 (PST)
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id SAA19303 for <voynich@rand.org>; Wed, 12 Jan 2000 18:19:49 -0800 (PST)
Received: from mail.nctimes.net (mail.nctimes.net [208.239.16.66]) by mail02-lax.pilot.net with ESMTP id SAA26285 for <voynich@rand.org>; Wed, 12 Jan 2000 18:19:48 -0800 (PST)
Received: from nctimes.net ([208.239.20.82]) by mail.nctimes.net
          (Netscape Messaging Server 3.5)  with ESMTP id AAA10B9;
          Wed, 12 Jan 2000 18:16:22 -0800
Message-ID: <387D3615.9F530F7A@nctimes.net>
Date: Wed, 12 Jan 2000 18:19:01 -0800
From: Mark Perakh <perakh@nctimes.net>
Reply-To: perakh@nctimes.net
Organization: home
X-Mailer: Mozilla 4.7 [en] (Win98; I)
X-Accept-Language: en,ru
MIME-Version: 1.0
To: Rene <Zandbergen@t-online.de>
Cc: voynich@rand.org
Subject: Re: Monkey texts
References: <128SHg-1p1NJ2C@fwd01.sul.t-online.de> <387D65A6.472B@alphalink.com.au> <128X0r-1BKYdMC@fwd03.sul.t-online.de>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR

Rene, thanks for the clarification.  The type of "quasi-monkey" text you
described is something I have not encountered, and I would be curious to see
one as well as the results of an LSC test on such a text. It may show some
unexpected features, perhaps sharpening the LSC tool for application to the
VMs and thus yielding additional information on whether it is meaningful.    Cheers, Mark

Rene wrote:

> Frogguy responded:
>
> > Non, m'sieur, pas correct. Just press the right arrow key,
> > and the text you see on the screen gets sent to a file
> > MONKEY.SEZ (the text so saved is highlighted in reverse video
> > on the screen). ...
>
> Oops. Can you say: RTFM?
> Thanks!
>
> > I really ought to rewrite Monkey from scratch, in Euphoria
> > (terrific language, Euphoria, no 64K limit, automatic
> > garbage collection, no dangling pointers...)
>
> I can't resist. Try fortran. No limit, no garbage creation in
> the first place, and no pointers either, dangling or otherwise.
>
> Mark: what I am after is a text typed by a somewhat 'intelligent'
> monkey. This will not just generate text based on a single-character
> frequency table, but the probability of each character depends
> on the previous 1, 2 or more characters. These would be 2nd or 3rd order
> monkeys respectively. Especially a 3rd order monkey process generates
> garbage of which the underlying language is very very clear, even
> though the resulting text is nonsensical. (There are nice examples
> in a 1970s book by W. R. Bennett, who also looked at the Voynich MS - see
> bibliography at Jim Reeds' web site).
>
> I expect that such monkey texts will exhibit LSC sums with the
> same profile as meaningful texts (but it will be interesting to see
> which order it takes to do so). If not, well, then the LSC sums really
> seem to quantify a much more 'intangible' form of meaning...
>
> Note: I am taking for granted that the Voynich MS was *not* created as
> a hoax by someone employing a similar technique as in Jacques' monkey
> program. At least not before the current century.
>
> Cheers, Rene

From jim@mail.rand.org  Wed Jan 12 22:37:10 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id WAA01656
	for <reeds@fry.research.att.com>; Wed, 12 Jan 2000 22:37:10 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id 1937D4CE13; Wed, 12 Jan 2000 22:37:10 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-blue.research.att.com (Postfix) with ESMTP id 796A54CE05
	for <reeds@research.att.com>; Wed, 12 Jan 2000 22:37:09 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id TAA16813; Wed, 12 Jan 2000 19:37:06 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id TAA22446; Wed, 12 Jan 2000 19:37:05 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id TAA11277 for <voynich@rand.org>; Wed, 12 Jan 2000 19:36:54 -0800 (PST)
From: CHRYSIPPVS@aol.com
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id TAA22433 for <voynich@rand.org>; Wed, 12 Jan 2000 19:36:53 -0800 (PST)
Received: from imo24.mx.aol.com (imo24.mx.aol.com [152.163.225.68]) by mail03-lax.pilot.net with ESMTP id TAA26685 for <voynich@rand.org>; Wed, 12 Jan 2000 19:36:52 -0800 (PST)
Received: from CHRYSIPPVS@aol.com
	by imo24.mx.aol.com (mail_out_v24.6.) id 6.c0.6a25c3 (4446)
	 for <voynich@rand.org>; Wed, 12 Jan 2000 22:36:16 -0500 (EST)
Message-ID: <c0.6a25c3.25aea230@aol.com>
Date: Wed, 12 Jan 2000 22:36:16 EST
Subject: joining the list..
To: voynich@rand.org
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Mailer: Windows AOL sub 45
Sender: jim@mail.rand.org
Status: OR

Hi, my name is Justin Sledge and I am interested in joining the list.  I am 18
and working on this project so that I may get my foot in the door of a possible
academic field before college.  I have taught myself Latin, Greek, Hebrew, and
Russian, so I hope I can apply these to the VMS. I have also done considerable
research on the Enochian language (which brought me to the VMS enigma).  I
am not sure that I will be able to contribute much, as I am far less educated
than many on this list.  I get the Photostats from Yale on Monday and will
get cracking... I have already read D'Imperio's book and the EVMT's transcription.
Well, I wish everyone well and hope I can contribute.

With my regards

Justin Sledge

From jim@mail.rand.org  Wed Jan 12 16:43:33 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id QAA84542
	for <reeds@fry.research.att.com>; Wed, 12 Jan 2000 16:43:33 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id 4D16B4CE36; Wed, 12 Jan 2000 16:43:33 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail02-lax.pilot.net (mail-lax-2.pilot.net [205.139.40.16])
	by mail-blue.research.att.com (Postfix) with ESMTP id B35AF4CE33
	for <reeds@research.att.com>; Wed, 12 Jan 2000 16:43:32 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail02-lax.pilot.net with ESMTP id NAA24376; Wed, 12 Jan 2000 13:43:28 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id NAA29848; Wed, 12 Jan 2000 13:43:26 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id NAA09639 for <voynich@rand.org>; Wed, 12 Jan 2000 13:43:16 -0800 (PST)
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id NAA29811 for <voynich@rand.org>; Wed, 12 Jan 2000 13:43:15 -0800 (PST)
Received: from mail.alphalink.com.au (mail.alphalink.com.au [203.24.205.7]) by mail02-lax.pilot.net with ESMTP id NAA24256 for <voynich@rand.org>; Wed, 12 Jan 2000 13:43:13 -0800 (PST)
Received: from LOCALNAME (d05-as14-mel.alphalink.com.au [202.161.98.36])
	by mail.alphalink.com.au (8.9.3/8.9.3) with SMTP id IAA11300;
	Thu, 13 Jan 2000 08:43:01 +1100
Message-ID: <387D65A6.472B@alphalink.com.au>
Date: Wed, 12 Jan 2000 21:41:58 -0800
From: Jacques Guy <jguy@alphalink.com.au>
Reply-To: jguy@alphalink.com.au
X-Mailer: Mozilla 3.01 (Win16; I)
MIME-Version: 1.0
To: Rene <Zandbergen@t-online.de>
Cc: voynich@rand.org
Subject: Re: Monkey texts
References: <128SHg-1p1NJ2C@fwd01.sul.t-online.de>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR

Rene wrote:
 
> I have a copy of Jacques' good old monkey program, but as far as I
> understand, while it can generate arbitrary text with the same entropy
> as a given source text, this can only be shown on the screen, not
> saved in a file. (Correct?)


Non, m'sieur, pas correct. Just press the right arrow key,
and the text you see on the screen gets sent to a file
MONKEY.SEZ (the text so saved is highlighted in reverse video
on the screen). Press the right arrow key again and it 
stops being saved to disk. Again and it is saved to disk,
again and... am I describing a *toggle*??? 

Monkey's gibbering is *appended* to the existing MONKEY.SEZ
(of course, if there isn't one, Monkey creates one).

I really ought to rewrite Monkey from scratch, in Euphoria
(terrific language, Euphoria, no 64K limit, automatic
garbage collection, no dangling pointers...)

Frogguy

From jim@mail.rand.org  Thu Jan 13 05:12:34 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id FAA78488
	for <reeds@fry.research.att.com>; Thu, 13 Jan 2000 05:12:34 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id 2CC5F1E00D; Thu, 13 Jan 2000 05:12:34 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail02-lax.pilot.net (mail-lax-2.pilot.net [205.139.40.16])
	by mail-green.research.att.com (Postfix) with ESMTP id 7ABD81E002
	for <reeds@research.att.com>; Thu, 13 Jan 2000 05:12:33 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail02-lax.pilot.net with ESMTP id CAA28837; Thu, 13 Jan 2000 02:12:30 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id CAA01454; Thu, 13 Jan 2000 02:12:29 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id CAA21441 for <voynich@rand.org>; Thu, 13 Jan 2000 02:10:52 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id CAA01412 for <voynich@rand.org>; Thu, 13 Jan 2000 02:10:51 -0800 (PST)
Received: from mailer3.bham.ac.uk (mailer3.bham.ac.uk [147.188.128.54]) by mail03-lax.pilot.net with ESMTP id CAA15877 for <voynich@rand.org>; Thu, 13 Jan 2000 02:10:50 -0800 (PST)
Received: from bham.ac.uk ([147.188.128.127])
	by mailer3.bham.ac.uk with esmtp (Exim 3.02 #16)
	id 128hDN-0007iQ-00
	for voynich@rand.org; Thu, 13 Jan 2000 10:10:49 +0000
Received: from is-fs13.bham.ac.uk ([147.188.128.55])
	by bham.ac.uk with esmtp (Exim 3.10 #1)
	id 128hDM-0006Hc-04
	for voynich@rand.org; Thu, 13 Jan 2000 10:10:48 +0000
Received: from BHAM-IS-FS13/SpoolDir by is-fs13.bham.ac.uk (Mercury 1.44);
    13 Jan 00 10:10:49 +0000
Received: from SpoolDir by BHAM-IS-FS13 (Mercury 1.44); 13 Jan 00 10:10:20 +0000
Received: from golem (147.188.72.20) by is-fs13.bham.ac.uk (Mercury 1.44) with ESMTP;
    13 Jan 00 10:08:55 +0000
From: "Gabriel Landini" <G.Landini@bham.ac.uk>
Organization: The University of Birmingham, UK.
To: voynich@rand.org
Date: Thu, 13 Jan 2000 10:07:29 -0000
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Subject: Re: Monkey texts
Reply-To: G.Landini@bham.ac.uk
In-reply-to: <387D3615.9F530F7A@nctimes.net>
X-mailer: Pegasus Mail for Win32 (v3.12a)
Message-ID: <8CD53E0F18@is-fs13.bham.ac.uk>
Sender: jim@mail.rand.org
Status: OR

On 12 Jan 00, at 18:19, Mark Perakh wrote:

> Rene, thanks for clarification.  The type of "quasi-monkey" text you
> described is something I did not encounter and I would be curious to
> see one as well as to see results of an LSC test of such a text. It
> may show some unexpected features, maybe sharpening the LSC tool to be
> applied to VMs and thus get additional info in regard to it being
> meaningful.    Cheers, Mark

Wouldn't the LSC (still reading about it!) on higher-degree monkeys 
approach the properties of a truly meaningful text?

Cheers,

Gabriel
 

From jim@mail.rand.org  Thu Jan 13 11:31:07 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id LAA70436
	for <reeds@fry.research.att.com>; Thu, 13 Jan 2000 11:31:07 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id DBDE74CE2E; Thu, 13 Jan 2000 11:31:06 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-blue.research.att.com (Postfix) with ESMTP id 685D64CE11
	for <reeds@research.att.com>; Thu, 13 Jan 2000 11:31:06 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id IAA07633; Thu, 13 Jan 2000 08:31:00 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id IAA16007; Thu, 13 Jan 2000 08:30:59 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id IAA05378 for <voynich@rand.org>; Thu, 13 Jan 2000 08:04:29 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id IAA14225 for <voynich@rand.org>; Thu, 13 Jan 2000 08:04:28 -0800 (PST)
Received: from mail.nctimes.net (mail.nctimes.net [208.239.16.66]) by mail01-lax.pilot.net with ESMTP id IAA27790 for <voynich@rand.org>; Thu, 13 Jan 2000 08:04:27 -0800 (PST)
Received: from nctimes.net ([208.239.20.7]) by mail.nctimes.net
          (Netscape Messaging Server 3.5)  with ESMTP id AAA32E9;
          Thu, 13 Jan 2000 08:01:00 -0800
Message-ID: <387DF758.F8AC0C00@nctimes.net>
Date: Thu, 13 Jan 2000 08:03:37 -0800
From: Mark Perakh <perakh@nctimes.net>
Reply-To: perakh@nctimes.net
Organization: home
X-Mailer: Mozilla 4.7 [en] (Win98; I)
X-Accept-Language: en,ru
MIME-Version: 1.0
To: G.Landini@bham.ac.uk
Cc: voynich@rand.org
Subject: Re: Monkey texts
References: <8CD53E0F18@is-fs13.bham.ac.uk>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR

Gabriel,  that is something I would like to know.  It may be the case, but I
am not making predictions.  Rene said he might conduct some LSC tests, so
let us wait and see.  Best, Mark

Gabriel Landini wrote:

> Wouldn't the LSC (still reading about it!) on higher-degree monkeys
> approach the properties of a truly meaningful text?
>
> Cheers,
>
> Gabriel
>

From jim@mail.rand.org  Thu Jan 13 11:22:49 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id LAA61014
	for <reeds@fry.research.att.com>; Thu, 13 Jan 2000 11:22:49 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id 98EF24CE2F; Thu, 13 Jan 2000 11:22:49 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-blue.research.att.com (Postfix) with ESMTP id 1B4D44CE2E
	for <reeds@research.att.com>; Thu, 13 Jan 2000 11:22:49 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id IAA04195; Thu, 13 Jan 2000 08:22:45 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id IAA14741; Thu, 13 Jan 2000 08:22:43 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id IAA07029 for <voynich@rand.org>; Thu, 13 Jan 2000 08:22:38 -0800 (PST)
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id IAA14718 for <voynich@rand.org>; Thu, 13 Jan 2000 08:22:37 -0800 (PST)
Received: from cts.com (adamsd@crash.cts.com [192.188.72.17]) by mail02-lax.pilot.net with ESMTP id IAA15985 for <voynich@rand.org>; Thu, 13 Jan 2000 08:22:18 -0800 (PST)
Received: by cts.com (8.9.3/8.9.3) id IAA05605;
	Thu, 13 Jan 2000 08:22:07 -0800 (PST)
From: Adams Douglas <adamsd@cts.com>
Message-Id: <200001131622.IAA05605@cts.com>
Subject: Re: joining the list..
To: CHRYSIPPVS@aol.com
Date: Thu, 13 Jan 2000 08:22:07 -0800 (PST)
Cc: voynich@rand.org
In-Reply-To: <c0.6a25c3.25aea230@aol.com> from "CHRYSIPPVS@aol.com" at Jan 12, 2000 10:36:16 PM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR

CHRYSIPPVS@aol.com wrote:
> I have taught myself Latin, Greek, Hebrew, and Russian 
> so I hope I can apply these to the VMS and I have also done considerable 
> research about the Enochian language (which brought me to the VMS enigma).  I 
> am not sure that I will be able to contribute much as I am far less educated 
> than many on this list.  I get the Photostats from Yale on Monday and will 
> get to cracking...already read D'Imperio's book and the EVMT's transcription.
>  Well, I wish everyone well and hope I can contribute.

You've certainly done your homework, Justin. :)

-Adams

-- 
====================================================
Adams Douglas, San Diego, CA   Adams@Douglas.net
http://Adams.Douglas.net/
PGP Public Keys: http://Adams.Douglas.net/pgpkey.txt
UTM:11S0487200 3623500 MGRS-2:11SMS872235 (100-meter)

               "Geography is only physics slowed down            
                with a few trees stuck on it."
                                   ---Terry Pratchett

From jim@mail.rand.org  Thu Jan 13 18:33:48 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id SAA26186
	for <reeds@fry.research.att.com>; Thu, 13 Jan 2000 18:33:48 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id 7BA181E028; Thu, 13 Jan 2000 18:33:48 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail02-lax.pilot.net (mail-lax-2.pilot.net [205.139.40.16])
	by mail-green.research.att.com (Postfix) with ESMTP id A11621E027
	for <reeds@research.att.com>; Thu, 13 Jan 2000 18:33:47 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail02-lax.pilot.net with ESMTP id PAA15158; Thu, 13 Jan 2000 15:33:42 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id PAA21205; Thu, 13 Jan 2000 15:33:40 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id PAA11232 for <voynich@rand.org>; Thu, 13 Jan 2000 15:32:58 -0800 (PST)
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id PAA21090 for <voynich@rand.org>; Thu, 13 Jan 2000 15:32:57 -0800 (PST)
Received: from mailout03.sul.t-online.de (mailout03.sul.t-online.de [194.25.134.81]) by mail02-lax.pilot.net with ESMTP id PAA14882 for <voynich@rand.org>; Thu, 13 Jan 2000 15:32:55 -0800 (PST)
Received: from fwd01.sul.t-online.de 
	by mailout03.sul.t-online.de with smtp 
	id 128tjb-0001ia-00; Fri, 14 Jan 2000 00:32:55 +0100
Received: from  (0625764225-0001@[62.156.39.125]) by fwd01.sul.t-online.de
	with smtp id 128tjX-0uW6E4C; Fri, 14 Jan 2000 00:32:51 +0100
From: Zandbergen@t-online.de (Rene)
To: voynich@rand.org
Subject: LSC sums for monkey texts
X-Mailer: T-Online eMail 2.3
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT
Date: Fri, 14 Jan 2000 00:32:51 +0100
Message-ID: <128tjX-0uW6E4C@fwd01.sul.t-online.de>
X-Sender: 0625764225-0001@t-dialin.net
Sender: jim@mail.rand.org
Status: OR

Dear all,

I have indeed done a few initial tests with the LSC technique on 
texts generated by Jacques' monkey program.
Tentatively, I would say that the technique still sees a difference
between real meaningful text and a 3rd or 4th order character monkey.
Not very conclusive but quite promising. 
I used a text length of only 20,000 characters, which is not enough
to be absolutely sure of the conclusion.
See a quick summary of what I did at:
http://www.geocities.com/voynichms/mylsc.html

There are some plots in which the X-scale has numbers from 1 to 19.
These are 'codes' and represent the following actual values:

1, 2, 3, 5, 7, 10, 15, 20, 30, 50, 70, 100, 150, 200, 300, etc.

I should rewrite the code in C so that I can run it at home.
And I would also appreciate it if Mark could send me one of his sample
texts so that I can validate the results of my program.

One more comment: spaces were removed from all source texts. With the
spaces included as an additional 'character', the sums change 
considerably.
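
[Editor's sketch, not part of Rene's message.] The effect of removing spaces
can be tried with a small Python function. It assumes that the LSC sum for a
chunk size n is the sum, over all characters and all pairs of adjacent
n-character chunks, of the squared difference in that character's count.
That definition is an assumption here, since this thread does not spell it
out, and the name `lsc_sum` and its `keep_spaces` switch are hypothetical.
It is not Rene's or Mark's program.

```python
from collections import Counter

def lsc_sum(text, chunk_size, keep_spaces=False):
    """Sum of squared count differences between adjacent chunks (assumed LSC sum)."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    if not keep_spaces:
        text = "".join(text.split())           # drop all whitespace
    n_chunks = len(text) // chunk_size         # ignore a trailing partial chunk
    used = text[:n_chunks * chunk_size]
    chunks = [Counter(used[i * chunk_size:(i + 1) * chunk_size])
              for i in range(n_chunks)]
    alphabet = set(used)
    total = 0
    for a, b in zip(chunks, chunks[1:]):
        for ch in alphabet:
            # a missing character counts as zero in a Counter
            total += (a[ch] - b[ch]) ** 2
    return total
```

Running it over the chunk sizes listed above (1, 2, 3, 5, 7, 10, ...) with and
without `keep_spaces=True` would show how much the space character
contributes to the sums for a given text.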

Comments are welcome.

Cheers, Rene

From jim@mail.rand.org  Thu Jan 13 20:28:08 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id UAA85350
	for <reeds@fry.research.att.com>; Thu, 13 Jan 2000 20:28:08 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id 92DCA1E027; Thu, 13 Jan 2000 20:28:08 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail02-lax.pilot.net (mail-lax-2.pilot.net [205.139.40.16])
	by mail-green.research.att.com (Postfix) with ESMTP id 091371E018
	for <reeds@research.att.com>; Thu, 13 Jan 2000 20:28:08 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail02-lax.pilot.net with ESMTP id RAA21015; Thu, 13 Jan 2000 17:28:03 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id RAA27724; Thu, 13 Jan 2000 17:28:00 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id RAA22302 for <voynich@rand.org>; Thu, 13 Jan 2000 17:27:23 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id RAA27681 for <voynich@rand.org>; Thu, 13 Jan 2000 17:27:22 -0800 (PST)
Received: from mail.nctimes.net (mail.nctimes.net [208.239.16.66]) by mail03-lax.pilot.net with ESMTP id RAA00875 for <voynich@rand.org>; Thu, 13 Jan 2000 17:27:22 -0800 (PST)
Received: from nctimes.net ([208.239.20.89]) by mail.nctimes.net
          (Netscape Messaging Server 3.5)  with ESMTP id AAA6DB0;
          Thu, 13 Jan 2000 17:23:55 -0800
Message-ID: <387E7B4C.8C9E6266@nctimes.net>
Date: Thu, 13 Jan 2000 17:26:36 -0800
From: Mark Perakh <perakh@nctimes.net>
Reply-To: perakh@nctimes.net
Organization: home
X-Mailer: Mozilla 4.7 [en] (Win98; I)
X-Accept-Language: en,ru
MIME-Version: 1.0
To: Rene <Zandbergen@t-online.de>
Cc: voynich@rand.org
Subject: Re: LSC sums for monkey texts
References: <128tjX-0uW6E4C@fwd01.sul.t-online.de>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR

Rene, thanks for the message. Sorry, I have to leave in a few minutes,
but first thing tomorrow morning I'll email you the texts you
requested. At first glance, the LSC test you conducted very clearly
shows that your monkey texts have a quite different distribution of
letter frequency variability compared to meaningful texts, but are
similar in some respects to the gibberish I explored.  The VMs, on the
other hand, has LSC curves precisely like those for meaningful texts.
It looks like LSC can serve as a sharp tool, do you agree?  Best, Mark


From jim@mail.rand.org  Fri Jan 14 02:17:00 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id CAA47945
	for <reeds@fry.research.att.com>; Fri, 14 Jan 2000 02:17:00 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id 925864CE12; Fri, 14 Jan 2000 02:17:00 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail03-lax.pilot.net (mail-lax-3.pilot.net [205.139.40.17])
	by mail-blue.research.att.com (Postfix) with ESMTP id 831754CE07
	for <reeds@research.att.com>; Fri, 14 Jan 2000 02:16:59 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail03-lax.pilot.net with ESMTP id XAA29275; Thu, 13 Jan 2000 23:15:57 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id XAA08866; Thu, 13 Jan 2000 23:15:55 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id XAA03628 for <voynich@rand.org>; Thu, 13 Jan 2000 23:15:44 -0800 (PST)
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id XAA08853 for <voynich@rand.org>; Thu, 13 Jan 2000 23:15:43 -0800 (PST)
Received: from mail.nctimes.net (mail.nctimes.net [208.239.16.66]) by mail02-lax.pilot.net with ESMTP id XAA13045 for <voynich@rand.org>; Thu, 13 Jan 2000 23:15:42 -0800 (PST)
Received: from nctimes.net ([208.239.20.71]) by mail.nctimes.net
          (Netscape Messaging Server 3.5)  with ESMTP id AAA3EF2;
          Thu, 13 Jan 2000 23:12:11 -0800
Message-ID: <387ECCC3.58AF4C42@nctimes.net>
Date: Thu, 13 Jan 2000 23:14:11 -0800
From: Mark Perakh <perakh@nctimes.net>
Reply-To: perakh@nctimes.net
Organization: home
X-Mailer: Mozilla 4.7 [en] (Win98; I)
X-Accept-Language: en,ru
MIME-Version: 1.0
To: Rene <Zandbergen@t-online.de>
Cc: voynich@rand.org
Subject: Re: LSC sums for monkey texts
References: <128tjX-0uW6E4C@fwd01.sul.t-online.de>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR

Rene, I have looked at your curves and noticed the following features:

1) The Se curves (calculated) are exactly as we obtained, so in regard
to Se your program seems to work the same way.

2) If I understand it correctly (and if not, please correct me), what
you call a 1st order monkey is actually a random permutation of the
letters of the original text.  Indeed, the Sm LSC sum looks like those
we obtained for such permutations.

3) The higher order monkeys are (if I understood it correctly) results
of random permutations of n-tuples of letters.  The fourth order monkey
is then somewhat similar to our texts obtained by random permutations
of words.  Indeed, the Sm curves for the 4th order monkey look rather
similar to those for our word-shuffled texts.

4) What is puzzling is the Sm curve for your original Latin text.  It
is like our typical Sm curves for meaningful texts (including Genesis
in Latin) at small n, but is rather different at large n.  For all
meaningful texts we obtained a well expressed growth of Sm at n
exceeding that for well formed PMP.  In your example PMP seems to be
not well formed and there is no typical rise of Sm toward large n.

In order to find the reason for that, I'll email you tomorrow some of
the texts we used (including VMS-A and VMS-B).  If you conduct the LSC
test on them using your program, we'll be able to see whether you
obtain the same curves we did or your program works differently.  I
would like to say that our program was tested and retested very
meticulously and we are confident it measures OK.  So, either you
encountered a Latin text which is peculiar in regard to LSC, or
something is wrong with the program.  Yes, our program ignored the
spaces, commas, etc.  Cheers, Mark


From jim@mail.rand.org  Fri Jan 14 02:51:18 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id CAA09157
	for <reeds@fry.research.att.com>; Fri, 14 Jan 2000 02:51:18 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id EABC71E028; Fri, 14 Jan 2000 02:51:17 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail02-lax.pilot.net (mail-lax-2.pilot.net [205.139.40.16])
	by mail-green.research.att.com (Postfix) with ESMTP id 5EE7E1E018
	for <reeds@research.att.com>; Fri, 14 Jan 2000 02:51:17 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail02-lax.pilot.net with ESMTP id XAA17297; Thu, 13 Jan 2000 23:50:15 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id XAA09405; Thu, 13 Jan 2000 23:50:14 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id XAA04044 for <voynich@rand.org>; Thu, 13 Jan 2000 23:50:08 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id XAA09385 for <voynich@rand.org>; Thu, 13 Jan 2000 23:50:07 -0800 (PST)
Received: from mail.nctimes.net (mail.nctimes.net [208.239.16.66]) by mail01-lax.pilot.net with ESMTP id XAA13232 for <voynich@rand.org>; Thu, 13 Jan 2000 23:38:31 -0800 (PST)
Received: from nctimes.net ([208.239.20.12]) by mail.nctimes.net
          (Netscape Messaging Server 3.5)  with ESMTP id AAA4696;
          Thu, 13 Jan 2000 23:35:04 -0800
Message-ID: <387ED244.171118EC@nctimes.net>
Date: Thu, 13 Jan 2000 23:37:40 -0800
From: Mark Perakh <perakh@nctimes.net>
Reply-To: perakh@nctimes.net
Organization: home
X-Mailer: Mozilla 4.7 [en] (Win98; I)
X-Accept-Language: en,ru
MIME-Version: 1.0
To: Rene <Zandbergen@t-online.de>
Cc: voynich@rand.org
Subject: Re: LSC sums for monkey texts
References: <128tjX-0uW6E4C@fwd01.sul.t-online.de>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR



Rene, we can double-check if you email me your Latin text of
20,000 letters; we can then test it with our program and see whether
the results are the same as those you obtained.  Mark

From jim@mail.rand.org  Fri Jan 14 12:47:01 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id MAA61948
	for <reeds@fry.research.att.com>; Fri, 14 Jan 2000 12:47:01 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id 367381E04C; Fri, 14 Jan 2000 12:41:59 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail02-lax.pilot.net (mail-lax-2.pilot.net [205.139.40.16])
	by mail-green.research.att.com (Postfix) with ESMTP id 416511E027
	for <reeds@research.att.com>; Fri, 14 Jan 2000 12:41:58 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail02-lax.pilot.net with ESMTP id JAA27065; Fri, 14 Jan 2000 09:40:32 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id JAA04272; Fri, 14 Jan 2000 09:40:29 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id JAA07903 for <voynich@rand.org>; Fri, 14 Jan 2000 09:39:55 -0800 (PST)
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id JAA04149 for <voynich@rand.org>; Fri, 14 Jan 2000 09:39:54 -0800 (PST)
Received: from hercules.acsu.buffalo.edu (qmailr@hercules.acsu.buffalo.edu [128.205.7.123]) by mail02-lax.pilot.net with SMTP id JAA26439 for <voynich@rand.org>; Fri, 14 Jan 2000 09:39:53 -0800 (PST)
Received: (qmail 13796 invoked from network); 14 Jan 2000 17:39:50 -0000
Received: from ubppp-247-012.ppp-net.buffalo.edu (HELO bob) (128.205.247.12)
  by hercules.acsu.buffalo.edu with SMTP; 14 Jan 2000 17:39:50 -0000
Message-Id: <3.0.5.32.20000114123912.00861bc0@pop.acsu.buffalo.edu>
X-Sender: dmharms@pop.acsu.buffalo.edu
X-Mailer: QUALCOMM Windows Eudora Light Version 3.0.5 (32)
Date: Fri, 14 Jan 2000 12:39:12 -0500
To: voynich@rand.org
From: Daniel Harms <dmharms@acsu.buffalo.edu>
Subject: A little hoax
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Sender: jim@mail.rand.org
Status: OR

Here's a little hoax (most likely) that was thrown the way of the
Lovecraftian community a few days ago.  My Greek is non-existent, so
I submit it for those who are interested in such things:

http://www.geocities.com/laurabertini/


Daniel Harms     dmharms@acsu.buffalo.edu
The Internet:  Learn what you know.  Share what you don't.

From jim@mail.rand.org  Fri Jan 14 16:54:14 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id QAA17146
	for <reeds@fry.research.att.com>; Fri, 14 Jan 2000 16:54:14 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id DB7D71E04B; Fri, 14 Jan 2000 16:54:13 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-green.research.att.com (Postfix) with ESMTP id 61E461E043
	for <reeds@research.att.com>; Fri, 14 Jan 2000 16:54:13 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id NAA29056; Fri, 14 Jan 2000 13:54:09 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id NAA27340; Fri, 14 Jan 2000 13:54:08 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id NAA12252 for <voynich@rand.org>; Fri, 14 Jan 2000 13:53:55 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id NAA27292 for <voynich@rand.org>; Fri, 14 Jan 2000 13:53:54 -0800 (PST)
Received: from mailout04.sul.t-online.de (mailout04.sul.t-online.de [194.25.134.18]) by mail03-lax.pilot.net with ESMTP id NAA07312 for <voynich@rand.org>; Fri, 14 Jan 2000 13:53:53 -0800 (PST)
Received: from fwd05.sul.t-online.de 
	by mailout04.sul.t-online.de with smtp 
	id 129EfI-00046C-03; Fri, 14 Jan 2000 22:53:52 +0100
Received: from  (0625764225-0001@[62.158.124.15]) by fwd05.sul.t-online.de
	with smtp id 129EfE-0supZQC; Fri, 14 Jan 2000 22:53:48 +0100
From: Zandbergen@t-online.de (Rene)
To: voynich@rand.org
References: <128tjX-0uW6E4C@fwd01.sul.t-online.de> <387ECCC3.58AF4C42@nctimes.net>
Subject: Re:  LSC sums for monkey texts
X-Mailer: T-Online eMail 2.3
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT
Date: Fri, 14 Jan 2000 22:53:48 +0100
Message-ID: <129EfE-0supZQC@fwd05.sul.t-online.de>
X-Sender: 0625764225-0001@t-dialin.net
Sender: jim@mail.rand.org
Status: OR

Hello Mark,

> Rene, I have looked at your curves and noticed the following features:
> the Se curves (calculated) are exactly as we obtained, so in regard to Se
> your program seems to work the same way. 2) If I understand it correctly
> (and if not you correct me) what you call 1st order monkey is actually a
> random permutation of letters of the original text.

It is not significantly different from that. It is a computer-generated
text with a single-character frequency equal to that of the source text.
All characters are generated independently of each other (i.e. sampling
_with_ replacement, rather than a true permutation.)
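
To make the distinction concrete, here is a minimal Python sketch
contrasting the two processes: drawing each character independently
from the source's single-character frequencies, versus shuffling the
source's own letters. The function names and the sample string are
illustrative assumptions, not taken from Jacques' Monkey program.

```python
import random
from collections import Counter

def first_order_monkey(source, length, seed=None):
    """Draw `length` characters independently, weighted by their
    frequency in `source` (sampling with replacement)."""
    rng = random.Random(seed)
    counts = Counter(source)
    letters = sorted(counts)
    weights = [counts[c] for c in letters]
    return "".join(rng.choices(letters, weights=weights, k=length))

def shuffled_text(source, seed=None):
    """Random permutation of the source letters (without replacement)."""
    rng = random.Random(seed)
    chars = list(source)
    rng.shuffle(chars)
    return "".join(chars)

# Hypothetical sample text, with spaces removed as in the tests above.
source = "arma virumque cano troiae qui primus ab oris".replace(" ", "")
monkey = first_order_monkey(source, len(source), seed=1)
shuffled = shuffled_text(source, seed=1)
```

The shuffled text keeps the exact letter counts of the source, while
the monkey text only matches them on average; for long texts the
difference becomes small.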
 
> Indeed, the Sm LSC sum looks like those we obtained for such
> permutations

> 3) The higher order monkeys are (if I understood it correctly) results
> of random permutations of n-tuples of letters.

Again: almost. Taking the case of the 3rd order process, what the Monkey
program does is make a table of all character triplets in the source
text. The computer-generated text is produced character by character,
where the probability of each new character depends on the two preceding
ones, and follows the distribution of all triplets in the source
text with the same pair of initial characters.
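
The procedure just described can be sketched in Python as follows.
This is only an illustration under stated assumptions, not the actual
Monkey program: the names `build_table` and `char_monkey` are made up
here, the output is seeded with the first characters of the source,
and the source is treated as circular so that every context has at
least one follower.

```python
import random
from collections import Counter, defaultdict

def build_table(source, order=3):
    """Map each (order-1)-character context to a Counter of the
    characters that follow it in the source."""
    # Assumption: wrap around the end so no context is a dead end.
    ext = source + source[:order - 1]
    table = defaultdict(Counter)
    for i in range(len(source)):
        table[ext[i:i + order - 1]][ext[i + order - 1]] += 1
    return table

def char_monkey(source, length, order=3, seed=None):
    """Generate text one character at a time; each new character is
    drawn from the followers of the preceding order-1 characters."""
    rng = random.Random(seed)
    table = build_table(source, order)
    out = list(source[:order - 1])  # assumption: start like the source
    while len(out) < length:
        context = "".join(out[len(out) - (order - 1):])
        letters, weights = zip(*table[context].items())
        out.append(rng.choices(letters, weights=weights)[0])
    return "".join(out[:length])

# Hypothetical sample text, spaces removed.
source = "galliaestomnisdivisainpartestres"
text = char_monkey(source, 60, order=3, seed=0)
```

With a source this short, most contexts have a single follower, so
the output mostly copies stretches of the source.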

> The fourth order monkey is then
> somehow similar to our texts obtained by random permutations of words.

Due to the fact that the source text is really *much* too short to
use a 4th order monkey properly, this text will indeed tend to consist
of small chunks from the source text all mixed up.
I must look again at your tests for texts with the words mixed up.
I would expect such a text to be 'nearer to meaningful' than a
4th order monkey text.

> Indeed, the Sm curves for 4-order monkey look rather similar to our
> word-shuffled texts. 4) What is puzzling is the Sm curve for your
> original Latin text.  It is like our typical Sm curves for meaningful
> texts (including Genesis in Latin) at small n, but is rather different at
> large n. For all meaningful texts we obtained a well expressed growth of
> Sm at n exceeding that for well formed PMP.  In your example PMP seems to
> be not well formed and there is no typical rise of Sm toward large n. In
> order to find the reason for that, I'll email you tomorrow some texts
> we used (including VMS-A and VMS-B). If you conduct LSC test on them
> using your program we'll be able to see if you obtain the same curves we
> did or your program works differently.

Yes. 
I will email you the text I used, and also the table of Sm and Se
values resulting from it. I suspect that the text length plays a 
major role. The jitter for higher values of 'n' in several of the
graphs makes me think that the text may have been a bit short.
Today I just ran one case: an English text of about 700,000 characters,
and the Sm curve was very smooth and went up to over 4*Se for n=50,000.
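
For readers who want to reproduce such numbers, here is a hedged
Python sketch of the two sums. The definitions are not spelled out in
this thread, so the following is assumed: the text is cut into chunks
of n letters, Sm sums over all letters and all adjacent chunk pairs
the squared difference of the letter's counts in the two chunks, and
Se is the expectation of that sum for a random permutation of the same
letters (hypergeometric counts). The function name `lsc_sums` is
hypothetical.

```python
from collections import Counter

def lsc_sums(text, n):
    """Return (Sm, Se) for chunk size n under the assumed definitions.

    Needs at least two complete chunks of n letters.
    """
    k = len(text) // n              # number of complete chunks
    used = text[:k * n]             # drop the incomplete tail
    L = len(used)
    chunks = [Counter(used[i * n:(i + 1) * n]) for i in range(k)]
    totals = Counter(used)          # M_x: total count of each letter
    # Measured sum: squared count differences between adjacent chunks.
    sm = sum((a[c] - b[c]) ** 2
             for a, b in zip(chunks, chunks[1:])
             for c in totals)
    # Expected sum for a random permutation of the same letters:
    # Se = 2 n (k-1) * sum_x M_x (L - M_x) / (L (L-1))
    se = (2 * n * (k - 1) * sum(m * (L - m) for m in totals.values())
          / (L * (L - 1)))
    return sm, se

sm, se = lsc_sums("abababab", 1)
```

Plotting Sm/Se against n for a source text and for its monkey
versions would give curves comparable in kind to those discussed
here, provided the assumed definitions match the ones in the articles.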

> I would like to say that our program was tested and retested
> very meticulously and we are confident it measures OK.  So,
> either you encountered a Latin text which is peculiar
> in regard to LSC, or something is wrong with the program.

I do not doubt for a moment that your program is reliable,
which is why I would like to try mine on your source texts and
compare with the numbers in your articles.

More later,
        Rene

From jim@mail.rand.org  Fri Jan 14 23:46:11 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id XAA62962
	for <reeds@fry.research.att.com>; Fri, 14 Jan 2000 23:46:11 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id E25811E0BA; Fri, 14 Jan 2000 23:46:10 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-green.research.att.com (Postfix) with ESMTP id 625571E01B
	for <reeds@research.att.com>; Fri, 14 Jan 2000 23:46:10 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id UAA17271; Fri, 14 Jan 2000 20:46:05 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id UAA18154; Fri, 14 Jan 2000 20:46:04 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id UAA09000 for <voynich@rand.org>; Fri, 14 Jan 2000 20:45:30 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id UAA18127 for <voynich@rand.org>; Fri, 14 Jan 2000 20:45:29 -0800 (PST)
Received: from callisto.acsu.buffalo.edu (qmailr@callisto.acsu.buffalo.edu [128.205.7.122]) by mail01-lax.pilot.net with SMTP id SAA04593 for <voynich@rand.org>; Fri, 14 Jan 2000 18:24:23 -0800 (PST)
Received: (qmail 4587 invoked from network); 15 Jan 2000 02:24:20 -0000
Received: from ubppp-245-003.ppp-net.buffalo.edu (HELO bob) (128.205.245.3)
  by callisto.acsu.buffalo.edu with SMTP; 15 Jan 2000 02:24:20 -0000
Message-Id: <3.0.5.32.20000114212342.00b18620@pop.acsu.buffalo.edu>
X-Sender: dmharms@pop.acsu.buffalo.edu
X-Mailer: QUALCOMM Windows Eudora Light Version 3.0.5 (32)
Date: Fri, 14 Jan 2000 21:23:42 -0500
To: voynich@rand.org
From: Daniel Harms <dmharms@acsu.buffalo.edu>
Subject: And another...
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Sender: jim@mail.rand.org
Status: OR

	I just remembered this, which might be of interest to the list re.
invented scripts:

	One of the first Necronomicon hoaxes on the market was the
_Al Azif_, published by Owlswick Press and with a foreword by L. Sprague
de Camp.  This was purportedly written in a language called "Duriac" (a
relative of Arabic), though in fact the script was created by a calligrapher 
who repeated it every sixteen pages, with a few extras at the beginning and 
the end.  The publishers slapped on a piece stating that several scholars from
Iraq had been killed when attempting to translate it, and sold it.

	The odd thing is, I met someone who worked for Owlswick Press
a few months ago.  He stated that the script probably did have some sort
of meaning.  Not being any sort of cryptographer, or even knowing where to
begin, I never looked into this.

	The book can sometimes be found in some of the larger libraries
across the United States, if anyone's interested in taking a look for
themselves.

Yrs.,


Daniel Harms     dmharms@acsu.buffalo.edu
The Internet:  Learn what you know.  Share what you don't.

From jim@mail.rand.org  Sat Jan 15 06:41:35 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id GAA84612
	for <reeds@fry.research.att.com>; Sat, 15 Jan 2000 06:41:34 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id 0EB5A4CE5A; Sat, 15 Jan 2000 06:41:34 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail03-lax.pilot.net (mail-lax-3.pilot.net [205.139.40.17])
	by mail-blue.research.att.com (Postfix) with ESMTP id 3422C4CE57
	for <reeds@research.att.com>; Sat, 15 Jan 2000 06:41:33 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail03-lax.pilot.net with ESMTP id DAA13488; Sat, 15 Jan 2000 03:41:29 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id DAA25189; Sat, 15 Jan 2000 03:41:28 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id DAA18065 for <voynich@rand.org>; Sat, 15 Jan 2000 03:41:05 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id DAA25161 for <voynich@rand.org>; Sat, 15 Jan 2000 03:41:05 -0800 (PST)
Received: from ns1.ovis.net (root@ns1.ovis.net [207.0.147.2]) by mail03-lax.pilot.net with ESMTP id DAA13467 for <voynich@rand.org>; Sat, 15 Jan 2000 03:41:04 -0800 (PST)
Received: from ovis.net (s44.pm5.ovis.net [207.0.147.110])
	by ns1.ovis.net (8.9.3/8.9.3) with ESMTP id GAA15885;
	Sat, 15 Jan 2000 06:40:59 -0500
Message-ID: <38805DD5.2FEE5575@ovis.net>
Date: Sat, 15 Jan 2000 06:45:25 -0500
From: Steve Kudlak <chromexa@ovis.net>
Reply-To: chromexa@ovis.net
X-Mailer: Mozilla 4.5 [en]C-CCK-MCD ezn/58/n  (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Daniel Harms <dmharms@acsu.buffalo.edu>
Cc: voynich@rand.org
Subject: Re: And another...
References: <3.0.5.32.20000114212342.00b18620@pop.acsu.buffalo.edu>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR



Daniel Harms wrote:

>         I just remembered this, which might be of interest to the list re.
> invented scripts:
>
>         One of the first Necronomicon hoaxes on the market was the
> _Al Azif_, published by Owlswick Press and with a foreword by L. Sprague
> de Camp.  This was purportedly written in a language called "Duriac" (a
> relative of Arabic), though in fact the script was created by a calligrapher
> who repeated it every sixteen pages, with a few extras at the beginning and
> the end.  The publishers slapped on a piece stating that several scholars from
> Iraq had been killed when attempting to translate it, and sold it.
>
>         The odd thing is, I met someone who worked for Owlswick Press
> a few months ago.  He stated that the script probably did have some sort
> of meaning.  Not being any sort of cryptographer, or even knowing where to
> begin, I never looked into this.
>
>         The book can sometimes be found in some of the larger libraries
> across the United States, if anyone's interested in taking a look for
> themselves.
>
> Yrs.,
>
> Daniel Harms     dmharms@acsu.buffalo.edu
> The Internet:  Learn what you know.  Share what you don't.

> Steve sez: Add Knowledge+confusion and mix :)

Anyway, this I would love to see. I have several copies of the Necronomicon, and
most are not that well done. Some follow the European standards, which have
"spells" for making one as small as possible, as large as possible, etc. Others
followed the post-Lovecraftian tradition, in that I think Lovecraft himself would
have frowned on anything remotely sexual. This sort of imagery was very popular in
the late 1970s and 1980s, perhaps fed by the magazines Metal
Hurlant (French)/Heavy Metal (US/UK?). I know several of the publishers had
connections to these magazines. I saw snippets of "Arabic" and, with my very, very
weak Arabic, believed I could make out "immense" or something like that. But that
was it.

Has anyone seen any purported "intermediate translations" of the Necronomicon into
Latin etc.? Or extant "editions" :) of the other books by Lovecraft and his friends?
After seeing an interesting though disturbing edition/hoax, which I did not
purchase (though I wish I had), I would have attempted the "sex with demons" search,
though at this point I fear I would find tons of porn pages, or anti-porn ones, and
the authors, now older, would probably want to put that sort of work in the
background.

By the way, this is kind of far afield from the VMS. Does anyone do serious
research (via the net) into the Necronomicon? I know there are lots of web pages,
but I haven't seen much that is serious. I guess I could start with S.T. Joshi and any
searches that descend from that.

Have Fun,
Sends Steve


From jim@mail.rand.org  Sat Jan 15 09:43:36 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id JAA52978
	for <reeds@fry.research.att.com>; Sat, 15 Jan 2000 09:43:36 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id D7D731E090; Sat, 15 Jan 2000 08:51:43 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail03-lax.pilot.net (mail-lax-3.pilot.net [205.139.40.17])
	by mail-green.research.att.com (Postfix) with ESMTP id 4FE6E1E04F
	for <reeds@research.att.com>; Sat, 15 Jan 2000 08:51:43 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail03-lax.pilot.net with ESMTP id FAA18761; Sat, 15 Jan 2000 05:51:39 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id FAA26718; Sat, 15 Jan 2000 05:51:38 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id FAA19242 for <voynich@rand.org>; Sat, 15 Jan 2000 05:51:24 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id FAA26693 for <voynich@rand.org>; Sat, 15 Jan 2000 05:51:22 -0800 (PST)
Received: from mailer3.bham.ac.uk (mailer3.bham.ac.uk [147.188.128.54]) by mail03-lax.pilot.net with ESMTP id FAA18732 for <voynich@rand.org>; Sat, 15 Jan 2000 05:51:22 -0800 (PST)
Received: from bham.ac.uk ([147.188.128.127])
	by mailer3.bham.ac.uk with esmtp (Exim 3.02 #16)
	id 129Tbb-0001dk-00; Sat, 15 Jan 2000 13:51:03 +0000
Received: from is-fs13.bham.ac.uk ([147.188.128.55])
	by bham.ac.uk with esmtp (Exim 3.10 #1)
	id 129Tbb-0002KY-01; Sat, 15 Jan 2000 13:51:03 +0000
Received: from BHAM-IS-FS13/SpoolDir by is-fs13.bham.ac.uk (Mercury 1.44);
    15 Jan 00 13:51:03 +0000
Received: from SpoolDir by BHAM-IS-FS13 (Mercury 1.44); 15 Jan 00 13:50:36 +0000
Received: from oemcomputer (147.188.135.5) by is-fs13.bham.ac.uk (Mercury 1.44) with ESMTP;
    15 Jan 00 13:50:31 +0000
From: "Gabriel Landini" <G.Landini@bham.ac.uk>
Organization: The University of Birmingham, U.K.
To: jguy@alphalink.com.au, voynich@rand.org
Date: Sat, 15 Jan 2000 13:50:51 -0000
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Subject: Re: LSC sums for monkey texts
Reply-To: G.Landini@bham.ac.uk
In-reply-to: <38807E3D.68D@alphalink.com.au>
X-mailer: Pegasus Mail for Win32 (v3.12b)
Message-ID: <1543F2044F@is-fs13.bham.ac.uk>
Sender: jim@mail.rand.org
Status: OR

On 15 Jan 00, at 6:03, Jacques Guy wrote:
> I vaguely suspect that
> LSC sums would distinguish between real Rotokas and
> second-order Monkey Rotokas. Third-order and beyond,
> I am not so sure.
> What do you think?

I think that the LSC depends heavily on the construction of words,
but I also think that word construction (because of Zipf's law)
depends heavily on a sub-set of the word pool.

Long-range correlations in codes were discussed for DNA a couple
of years ago in very prestigious journals like Nature and Science,
but to date I do not think that anybody has had a convincing theory or
explanation of the meaning and validity of the results.

If you think about it, what really is the relation (in any terms) between
a piece of text and another that is many characters away? What is the
large scale structure of a text?  That would mean that there are
events at small scales and also at larger scales.
I can imagine that up to the sentence level or so there may be
patterns or correlations (what we call grammar?), but beyond that, I
am not sure.
Think of a dictionary: there may not be any structure beyond 1
sentence or definition (still, Roget's Thesaurus conforms to Zipf's law
for the more frequent words). Consequently I see no reason why there
should be any large scale structures in texts. (I may be very wrong).

I suggested the other day that higher-order Monkeys generate LSC 
which are closer and closer to that of the language the Monkeys 
are based on. If I understand correctly, Rene's analysis seems to
confirm that?

I guess that the LSC could not differentiate between, let's say, an 
"order 3 word-Monkey" and a real text. (Word Monkeys generate a 
language based on the probability of words, rather than 
characters). Note that 3rd order word-Monkeys usually generate 
readable, (meaningless and most of the time hilarious) text.
Perhaps this is worth looking into.
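
A word monkey of this kind can be sketched in Python as below. This is
an illustration only: the function name `word_monkey`, the sample
sentence, and the choice to treat the word list as circular (so that
every context has a follower) are assumptions made here, not a
description of any existing program.

```python
import random
from collections import Counter, defaultdict

def word_monkey(words, length, order=3, seed=None):
    """Generate `length` words; each new word is drawn according to
    how often it follows the preceding order-1 words in `words`."""
    rng = random.Random(seed)
    # Assumption: wrap around the end so no context is a dead end.
    ext = words + words[:order - 1]
    table = defaultdict(Counter)
    for i in range(len(words)):
        context = tuple(ext[i:i + order - 1])
        table[context][ext[i + order - 1]] += 1
    out = list(words[:order - 1])   # assumption: start like the source
    while len(out) < length:
        context = tuple(out[len(out) - (order - 1):])
        choices, weights = zip(*table[context].items())
        out.append(rng.choices(choices, weights=weights)[0])
    return out[:length]

# Hypothetical sample text.
sample = "the cat sat on the mat and the cat ate the rat on the mat".split()
out = word_monkey(sample, 20, order=3, seed=0)
```

Joining the output letters without spaces would give a text that can
be fed to an LSC program in the same way as the character monkeys.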

Cheers,

Gabriel


From jim@mail.rand.org  Sat Jan 15 01:22:20 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id BAA41063
	for <reeds@fry.research.att.com>; Sat, 15 Jan 2000 01:22:20 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id A33791E064; Sat, 15 Jan 2000 01:05:35 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-green.research.att.com (Postfix) with ESMTP id 279C91E060
	for <reeds@research.att.com>; Sat, 15 Jan 2000 01:05:35 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id WAA22505; Fri, 14 Jan 2000 22:05:31 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id WAA20543; Fri, 14 Jan 2000 22:05:30 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id WAA11440 for <voynich@rand.org>; Fri, 14 Jan 2000 22:05:20 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id WAA20530 for <voynich@rand.org>; Fri, 14 Jan 2000 22:05:19 -0800 (PST)
Received: from mail.alphalink.com.au (mail.alphalink.com.au [203.24.205.7]) by mail03-lax.pilot.net with ESMTP id WAA25818 for <voynich@rand.org>; Fri, 14 Jan 2000 22:05:17 -0800 (PST)
Received: from LOCALNAME (d22-as16-mel.alphalink.com.au [202.161.98.117])
	by mail.alphalink.com.au (8.9.3/8.9.3) with SMTP id RAA12190
	for <voynich@rand.org>; Sat, 15 Jan 2000 17:04:47 +1100
Message-ID: <38807E3D.68D@alphalink.com.au>
Date: Sat, 15 Jan 2000 06:03:41 -0800
From: Jacques Guy <jguy@alphalink.com.au>
Reply-To: jguy@alphalink.com.au
X-Mailer: Mozilla 3.01 (Win16; I)
MIME-Version: 1.0
To: voynich@rand.org
Subject: Re: LSC sums for monkey texts
References: <128tjX-0uW6E4C@fwd01.sul.t-online.de> <387E7B4C.8C9E6266@nctimes.net>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR

I am trying to think of properties of language,
real languages, and to construct falsifiable theories.

Take those languages where you cannot have two
consonants in a row, and where a consonant must
always be followed by a vowel. Almost all 
Polynesian languages, many Austronesian languages,
lots of Papuan languages are like that.

A second-order letter monkey will produce two-letter
sequences in them with the same relative frequencies
as the original it is aping. I remember my colleague
Donald Laycock saying that, in some Papuan languages like
that, 80% of possible words existed in the language.
(A language such as Rotokas, with just 6 consonants
and five vowels, can ill afford not to make use
of every possible word-form). I vaguely suspect that
LSC sums would distinguish between real Rotokas and
second-order Monkey Rotokas. Third-order and beyond,
I am not so sure.
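
A minimal sketch of a second-order letter monkey on such a language.
The sample string below is invented consonant-vowel filler, not real
Rotokas, and the function name and the wrap-around from the last letter
to the first are assumptions made only to keep the example short.

```python
import random
from collections import defaultdict

def letter_monkey2(text, length, seed=0):
    """Draw each letter given the previous one, using the digrams of `text`."""
    rng = random.Random(seed)
    nxt = defaultdict(list)
    # Pair every letter with its successor; the last letter wraps to the first.
    for a, b in zip(text, text[1:] + text[0]):
        nxt[a].append(b)
    out = [rng.choice(text)]
    while len(out) < length:
        out.append(rng.choice(nxt[out[-1]]))
    return "".join(out)

toy = "kapotirevakopitarovakepi"   # invented CV-alternating sample
fake = letter_monkey2(toy, 60, seed=3)
vowels = set("aeiou")
# The monkey never emits two consonants in a row, because no such
# digram exists in the source it is aping.
print(fake, all(a in vowels or b in vowels for a, b in zip(fake, fake[1:])))
```

So the monkey keeps the consonant-vowel rule for free; whatever LSC
picks up beyond that has to come from longer-range structure.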

What do you think?

I am sure that, with a bit of effort, I can find
at least Luke or Matthew in Rotokas. But before
putting it to the test, I must do more thinking,
to emit a hypothesis that will stand or fall.

Frogguy, with a thinking cap firmly glued on.

From jim@mail.rand.org  Sat Jan 15 19:52:47 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id TAA54646
	for <reeds@fry.research.att.com>; Sat, 15 Jan 2000 19:52:46 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id 8FF244CE10; Sat, 15 Jan 2000 19:52:46 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-blue.research.att.com (Postfix) with ESMTP id E60384CE01
	for <reeds@research.att.com>; Sat, 15 Jan 2000 19:52:45 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id QAA23762; Sat, 15 Jan 2000 16:52:42 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id QAA07695; Sat, 15 Jan 2000 16:52:40 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id QAA01728 for <voynich@rand.org>; Sat, 15 Jan 2000 16:52:32 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id QAA07652 for <voynich@rand.org>; Sat, 15 Jan 2000 16:52:31 -0800 (PST)
Received: from mail.nctimes.net (mail.nctimes.net [208.239.16.66]) by mail01-lax.pilot.net with ESMTP id IAA18475 for <voynich@rand.org>; Sat, 15 Jan 2000 08:49:02 -0800 (PST)
Received: from nctimes.net ([208.239.20.45]) by mail.nctimes.net
          (Netscape Messaging Server 3.5)  with ESMTP id AAA2A9;
          Sat, 15 Jan 2000 08:45:30 -0800
Message-ID: <3880A4A2.3ACAEB2D@nctimes.net>
Date: Sat, 15 Jan 2000 08:47:30 -0800
From: Mark Perakh <perakh@nctimes.net>
Reply-To: perakh@nctimes.net
Organization: home
X-Mailer: Mozilla 4.7 [en] (Win98; I)
X-Accept-Language: en,ru
MIME-Version: 1.0
To: G.Landini@bham.ac.uk
Cc: jguy@alphalink.com.au, voynich@rand.org
Subject: Re: LSC sums for monkey texts
References: <1543F2044F@is-fs13.bham.ac.uk>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR

First, I submit that, from the LSC viewpoint, there is no difference in
principle between monkey texts on the one hand and permuted texts on the
other. A first-order monkey behaves practically like a letter-permuted
text, and an n-th order monkey behaves like a text permuted in n-tuple
chunks, to all intents and purposes. In both cases the stock of letters
available is limited to that in the original meaningful text (what we
call the identity permutation).  Therefore the statistics in both cases
are those of sampling without replacement.  For LSC this makes no
quantitative difference, because the expected LSC sum differs between
<with replacement> (that is, the multinomial distribution) and <without
replacement> (that is, the hypergeometric distribution) only by a factor
of L/(L-1), where L is the length of the text expressed in number of
letters.  If L>>1 the difference is negligible.  There is no reason to
believe that the measured sums will differ to a much larger extent.
Therefore I submit that the behavior of monkey texts can be reasonably
foreseen from the data for permuted texts.  These data showed that LSC
distinguishes quite well between the original meaningful text and its
permutations (letter, word, and verse permutations alike). The
preliminary results by Rene seem to confirm that expectation. As to
Rotokas, I have no idea about it, but why not just try LSC on it?  Rene
now has a program which, as we have verified, measures LSC sums well.

Best to all, Mark
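
The L/(L-1) factor mentioned above can be checked by brute force on a
toy text. This is only a sketch: it assumes the expected LSC sum is
built from squared differences between a letter's counts in two
adjacent chunks of n letters, and the six-letter text and the helper
name are made up for the illustration.

```python
from fractions import Fraction
from itertools import permutations, product

def mean_sq_diff(samples, letter, n):
    """Exact mean of (count in chunk 1 - count in chunk 2)^2 over all samples."""
    total = Fraction(0)
    count = 0
    for s in samples:
        a = s[:n].count(letter)
        b = s[n:2 * n].count(letter)
        total += (a - b) ** 2
        count += 1
    return total / count

text = "aaabbc"          # toy text, L = 6
L, n = len(text), 2

# Without replacement: every permutation of the text is equally likely.
without = mean_sq_diff(permutations(text), "a", n)
# With replacement: each of the 2n positions is drawn independently
# from the letters of the text.
with_repl = mean_sq_diff(product(text, repeat=2 * n), "a", n)

ratio = without / with_repl
print(without, with_repl, ratio)   # 6/5 1 6/5
```

For this toy case the ratio comes out as exactly L/(L-1) = 6/5; for a
text of tens of thousands of letters it is indistinguishable from 1.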

Gabriel Landini wrote:

> On 15 Jan 00, at 6:03, Jacques Guy wrote:
> > I vaguely suspect that
> > LSC sums would distinguish between real Rotokas and
> > second-order Monkey Rotokas. Third-order and beyond,
> > I am not so sure.
> > What do you think?
>
> I think that the LSC depends heavily on the construction of words,
> But also think that word construction (because of Zipf's law)
> depends heavily on a sub-set of the word pool.
>
> Long-range correlations in codes was discussed in DNA a couple
> of years ago in very prestigious Journals like Nature and Science,
> but to date I do not think that anybody had a convincing theory or
> explanation of the meaning and validity of the results.
>
> If you think, really what is the relation (in any terms) of a piece of
> text which is many characters away from another? What is the
> large scale structure of a text?  That would mean that there are
> events at a small scales and also at larger scales.
> I can imagine that up to the sentence level or so there may be
> patterns or correlations (what we call grammar?), but beyond that, I
> am not sure.
> Think of a dictionnary, there may not be any structure beyond 1
> sentence or definition (still Roget's Thesaurus coforms Zipf's law for
> the more frequent words). Consequently I see no reason why there
> should be any large scale structures in texts. (I may be very wrong).
>
> I suggested the other day that higher-order Monkeys generate LSC
> which are closer and closer to that of the language the Monkeys
> are based on. If I understand correct, Rene's analysis seems to
> confirm that?
>
> I guess that the LSC could not differentiate between, let's say, an
> "order 3 word-Monkey" and a real text. (Word Monkeys generate a
> language based on the probability of words, rather than
> characters). Note that 3rd order word-Monkeys usually generate
> readable, (meaningless and most of the time hilarious) text.
> Perhaps this is worth looking into.
>
> Cheers,
>
> Gabriel

From jim@mail.rand.org  Sat Jan 15 14:23:44 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id OAA22174
	for <reeds@fry.research.att.com>; Sat, 15 Jan 2000 14:23:43 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id BFDE94CE46; Sat, 15 Jan 2000 14:23:43 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail02-lax.pilot.net (mail-lax-2.pilot.net [205.139.40.16])
	by mail-blue.research.att.com (Postfix) with ESMTP id 4E3C44CE3B
	for <reeds@research.att.com>; Sat, 15 Jan 2000 14:23:43 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail02-lax.pilot.net with ESMTP id LAA05875; Sat, 15 Jan 2000 11:23:35 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id LAA01835; Sat, 15 Jan 2000 11:23:34 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id LAA25239 for <voynich@rand.org>; Sat, 15 Jan 2000 11:21:48 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id LAA01746 for <voynich@rand.org>; Sat, 15 Jan 2000 11:21:47 -0800 (PST)
Received: from mailer3.bham.ac.uk (mailer3.bham.ac.uk [147.188.128.54]) by mail01-lax.pilot.net with ESMTP id KAA03849 for <voynich@rand.org>; Sat, 15 Jan 2000 10:36:42 -0800 (PST)
Received: from bham.ac.uk ([147.188.128.127])
	by mailer3.bham.ac.uk with esmtp (Exim 3.02 #16)
	id 129Y41-00056w-00
	for voynich@rand.org; Sat, 15 Jan 2000 18:36:41 +0000
Received: from is-fs13.bham.ac.uk ([147.188.128.55])
	by bham.ac.uk with esmtp (Exim 3.10 #1)
	id 129Y41-0000rc-00
	for voynich@rand.org; Sat, 15 Jan 2000 18:36:41 +0000
Received: from BHAM-IS-FS13/SpoolDir by is-fs13.bham.ac.uk (Mercury 1.44);
    15 Jan 00 18:36:42 +0000
Received: from SpoolDir by BHAM-IS-FS13 (Mercury 1.44); 15 Jan 00 18:36:12 +0000
Received: from oemcomputer (147.188.135.4) by is-fs13.bham.ac.uk (Mercury 1.44) with ESMTP;
    15 Jan 00 18:36:03 +0000
From: "Gabriel Landini" <G.Landini@bham.ac.uk>
Organization: The University of Birmingham, U.K.
To: voynich@rand.org
Date: Sat, 15 Jan 2000 18:36:23 -0000
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Subject: Re: LSC sums for monkey texts
Reply-To: G.Landini@bham.ac.uk
In-reply-to: <3880A4A2.3ACAEB2D@nctimes.net>
X-mailer: Pegasus Mail for Win32 (v3.12b)
Message-ID: <1A06A430AF@is-fs13.bham.ac.uk>
Sender: jim@mail.rand.org
Status: OR

On 15 Jan 00, at 8:47, Mark Perakh wrote:

> Therefore I submit that the behavior of monkey texts can be
> reasonably foreseen from the data for permuted texts. 

I am not sure that I follow. It is easier to generate an n-order
character monkey text because you store only the probabilities
and then you only generate the "next" character. How do we
generate permuted n-plets and ensure that the probabilities of the
"plets" appearing at the boundaries of the newly permuted "plets"
also fall within the observed probabilities of the original
language?
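
The boundary question can be looked at directly. The sketch below cuts
a text into n-letter chunks, shuffles them, and lists the k-letter
sequences of the result that never occur in the original. Both function
names and the sample sentence are hypothetical, chosen only for this
illustration.

```python
import random

def permute_chunks(text, n, seed=0):
    """Cut `text` into n-letter chunks and return them in random order."""
    rng = random.Random(seed)
    chunks = [text[i:i + n] for i in range(0, len(text), n)]
    rng.shuffle(chunks)
    return "".join(chunks)

def novel_ngrams(original, candidate, k):
    """k-letter sequences of `candidate` that never occur in `original`."""
    seen = {original[i:i + k] for i in range(len(original) - k + 1)}
    return [candidate[i:i + k]
            for i in range(len(candidate) - k + 1)
            if candidate[i:i + k] not in seen]

sample = "the quick brown fox jumps over the lazy dog"
shuffled = permute_chunks(sample, 3, seed=1)
# Letter counts are untouched; anything listed here was created at a
# join between two chunks.
print(repr(shuffled), novel_ngrams(sample, shuffled, 2))
```

A character monkey of order k never produces an unseen k-letter
sequence, whereas the chunk permutation can do so at each join.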

> These data showed that LSC distinguishes quite well between original
> meaningful text and its permutations (letter- words- , and verses
> permutations alike). 

It may be, but if you consider n-order WORD monkey texts, you
lose all original meaning while the new text is still readable (for
order 3, all sequences of 3 words in the new text *exist* in the
original text) and therefore some grammar remains. That is why it is
"readable"; order-1 word monkeys are just the words in random order,
and therefore grammar is lost.

I still suspect that LSC would not differentiate between an
order 3 *word*-Monkey and a real text, but of course I haven't
tested it.

Regards,
Gabriel


From jim@mail.rand.org  Sat Jan 15 18:13:49 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id SAA01778
	for <reeds@fry.research.att.com>; Sat, 15 Jan 2000 18:13:49 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id 198B84CE09; Sat, 15 Jan 2000 18:13:49 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail02-lax.pilot.net (mail-lax-2.pilot.net [205.139.40.16])
	by mail-blue.research.att.com (Postfix) with ESMTP id 9E35A4CE01
	for <reeds@research.att.com>; Sat, 15 Jan 2000 18:13:48 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail02-lax.pilot.net with ESMTP id PAA20971; Sat, 15 Jan 2000 15:13:45 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id PAA05194; Sat, 15 Jan 2000 15:13:43 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id OAA28785 for <voynich@rand.org>; Sat, 15 Jan 2000 14:25:51 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id OAA04858 for <voynich@rand.org>; Sat, 15 Jan 2000 14:25:49 -0800 (PST)
Received: from mail.nctimes.net (mail.nctimes.net [208.239.16.66]) by mail03-lax.pilot.net with ESMTP id NAA23687 for <voynich@rand.org>; Sat, 15 Jan 2000 13:54:39 -0800 (PST)
Received: from nctimes.net ([208.239.20.158]) by mail.nctimes.net
          (Netscape Messaging Server 3.5)  with ESMTP id AAA78;
          Sat, 15 Jan 2000 13:51:09 -0800
Message-ID: <3880EC6B.BBD3C95D@nctimes.net>
Date: Sat, 15 Jan 2000 13:53:48 -0800
From: Mark Perakh <perakh@nctimes.net>
Reply-To: perakh@nctimes.net
Organization: home
X-Mailer: Mozilla 4.7 [en] (Win98; I)
X-Accept-Language: en,ru
MIME-Version: 1.0
To: G.Landini@bham.ac.uk
Cc: voynich@rand.org
Subject: Re: LSC sums for monkey texts
References: <1A06A430AF@is-fs13.bham.ac.uk>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR



Gabriel Landini wrote:

> On 15 Jan 00, at 8:47, Mark Perakh wrote:
>
> > Therefore I submit that the behavior of monkey texts can be
> > reasonably foreseen from the data for permuted texts.
>
> I am not sure that I follow. It is easier to generate a n-order
> character monkey text because you store only the probabilities
> and then you only generate the "next" character.

Thanks for the comments. I did not discuss it in terms of it being
easier or harder to get. The computer does the job in both cases, the
program exists, so the question of easiness seems to be moot.  I am not
suggesting using permutations instead of the monkey program.  My
comments related only to the question of whether or not we can expect
LSC to distinguish between meaningful and monkey texts.  I believe the
behavior of monkey texts from the standpoint of LSC is expected to be
quite similar to that of permuted texts, therefore LSC is expected to
work for monkeys as well as for permutations.  I do not think LSC will
distinguish between permuted and monkey texts.  This is based of course
on the assumption that the texts are long enough so that the actual
frequencies of letter occurrences are quite close to their
probabilities.

> How do we
> generate permuted n-plets and assure that the probabilities of the
> "plets" appearing at the boundaries of the newly permuted "plets"
> are also falling within the observed probabilities of the original
> language?
>

They may not.  This hardly matters for the question of whether or not
LSC will distinguish between n-plet monkey and meaningful text.

>
> >
> It may be, but if you consider n-order WORD monkey texts, you
> lost all original meaning while the new text still it is readable (all
> sequences of 3 words in the new text *exist* in the original text)
> and therefore some grammar remains. That is why is "readable"; in
> order 1-word monkeys are just the words in random order and
> therefore grammar is lost.
> I still suspect that LSC would not differentiate between, an
> order 3 *word*-Monkey and a real text, but of course I haven't
> tested it.

In my paper #5 there are LSC data for texts randomized in various ways.
One of them was to permute entire verses (in Genesis) without
permuting words or letters within the verses.  Each verse contained
considerably more than 3 words. The LSC sum for such a permuted text is
quite clearly different from that of the non-permuted meaningful text.
The difference is at relatively large chunk sizes, while at small n the
sum behaves very similarly to that of a meaningful text. Of course this
was expected, because at small n, when the chunk size is smaller than
the verse size, each verse preserves its meaning, so from the standpoint
of LSC it is just another meaningful text.  When n is larger than the
average verse size, shuffling the verses kills the long-range order
inherent in the meaningful contents, and the LSC immediately reveals
that.  An n-order word monkey is not different in principle from a
verse-shuffled text, from the standpoint of LSC.  Therefore I expect
that LSC will show the difference between an n-order word monkey and a
meaningful text at chunk sizes exceeding the order of the monkey (which
for orders such as 3 and 4 is a rather small chunk size n).  As to the
letter monkeys, it is even more reasonable to expect this, as the
difference would be revealed already at rather small n.  Of course I can
be wrong, so let us wait until Rene obtains the data.
Cheers, Mark


From jim@mail.rand.org  Mon Jan 17 16:46:22 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id QAA74089
	for <reeds@fry.research.att.com>; Mon, 17 Jan 2000 16:46:22 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id 6E8661E035; Mon, 17 Jan 2000 16:46:22 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail02-lax.pilot.net (mail-lax-2.pilot.net [205.139.40.16])
	by mail-green.research.att.com (Postfix) with ESMTP id C2DBD1E033
	for <reeds@research.att.com>; Mon, 17 Jan 2000 16:46:21 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail02-lax.pilot.net with ESMTP id NAA07226; Mon, 17 Jan 2000 13:46:17 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id NAA05054; Mon, 17 Jan 2000 13:46:16 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id NAA20487 for <voynich@rand.org>; Mon, 17 Jan 2000 13:45:37 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id NAA05003 for <voynich@rand.org>; Mon, 17 Jan 2000 13:45:36 -0800 (PST)
Received: from mailout04.sul.t-online.de (mailout04.sul.t-online.de [194.25.134.18]) by mail01-lax.pilot.net with ESMTP id NAA11638 for <voynich@rand.org>; Mon, 17 Jan 2000 13:45:35 -0800 (PST)
Received: from fwd03.sul.t-online.de 
	by mailout04.sul.t-online.de with smtp 
	id 12AJxu-0004yS-02; Mon, 17 Jan 2000 22:45:34 +0100
Received: from  (0625764225-0001@[193.159.141.146]) by fwd03.sul.t-online.de
	with smtp id 12AJxt-0OiTYmC; Mon, 17 Jan 2000 22:45:33 +0100
From: Zandbergen@t-online.de (Rene)
To: voynich@rand.org
Subject: A few LSC comments
X-Mailer: T-Online eMail 2.3
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT
Date: Mon, 17 Jan 2000 22:45:33 +0100
Message-ID: <12AJxt-0OiTYmC@fwd03.sul.t-online.de>
X-Sender: 0625764225-0001@t-dialin.net
Sender: jim@mail.rand.org
Status: OR


I haven't yet been able to do any further calculations and comparisons,
but here are a few assorted comments.

First of all, Jim Reeds detected an error in the calculation of Se, but
it soon turned out that this is a typo on my web page.
Where I wrote:
Se = 2 (L* - k) ( 1 - SUM ( etc....   ))
it should have been:
Se = 2 (L* - n) ( 1 - SUM ( etc....   ))

Secondly, I agree with Gabriel that using a 3rd order word monkey
would be even more interesting in terms of checking the capabilities
of the LSC method in detecting meaningful text. On the other hand,
getting meaningful word entropy statistics is even more difficult
than getting 3rd order character entropy values, so the text from
a 3rd order word monkey will repeat the source text from which the 
statistics have been drawn much more closely than should be the
case. As before, a 1st order word monkey will be equivalent to a
random permutation of words, and if it is true (in a statistically
significant manner) that the LSC test distinguishes between one and
the other, we do have another useful piece of evidence w.r.t. the
Voynich MS text.

In view of the difficulty in judging the meaningfulness of my
quick tests using character monkeys, I'll add 95% (e.g.) confidence
intervals around the Sm curves for the random texts.
Some time in the next few days.
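
A rough sketch of how such a band could be obtained empirically. It
assumes that the measured sum Sm for chunk size n is the sum, over
adjacent n-letter chunks and over letters, of the squared differences
in letter counts, and it uses plain letter shuffles as a stand-in for
the random texts. The function names, the number of trials and the
percentile rule are all illustrative choices, not what is actually
implemented.

```python
import random
from collections import Counter

def lsc_sum(text, n):
    """Assumed Sm: squared letter-count differences between adjacent chunks."""
    k = len(text) // n                       # number of complete chunks
    counts = [Counter(text[i * n:(i + 1) * n]) for i in range(k)]
    total = 0
    for a, b in zip(counts, counts[1:]):
        for ch in a.keys() | b.keys():       # a missing letter counts as 0
            total += (a[ch] - b[ch]) ** 2
    return total

def shuffle_band(text, n, trials=200, level=0.95, seed=0):
    """Empirical band of Sm over random letter permutations of `text`."""
    rng = random.Random(seed)
    letters = list(text)
    sums = []
    for _ in range(trials):
        rng.shuffle(letters)
        sums.append(lsc_sum("".join(letters), n))
    sums.sort()
    lo = sums[int((1 - level) / 2 * (trials - 1))]
    hi = sums[int((1 + level) / 2 * (trials - 1))]
    return lo, hi

sample = "in the beginning god created the heaven and the earth " * 20
for n in (1, 5, 20):
    print(n, lsc_sum(sample, n), shuffle_band(sample, n))
```

A measured Sm that falls outside the band at a given n would then be
the signal to look at; one inside it says nothing either way.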

More later, Rene

From jim@mail.rand.org  Mon Jan 17 20:08:47 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id UAA21161
	for <reeds@fry.research.att.com>; Mon, 17 Jan 2000 20:08:46 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id CB4364CE04; Mon, 17 Jan 2000 20:08:46 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail03-lax.pilot.net (mail-lax-3.pilot.net [205.139.40.17])
	by mail-blue.research.att.com (Postfix) with ESMTP id 51B0A4CE02
	for <reeds@research.att.com>; Mon, 17 Jan 2000 20:08:46 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail03-lax.pilot.net with ESMTP id RAA02187; Mon, 17 Jan 2000 17:08:41 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id RAA13585; Mon, 17 Jan 2000 17:08:40 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id RAA11464 for <voynich@rand.org>; Mon, 17 Jan 2000 17:08:22 -0800 (PST)
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id RAA13503 for <voynich@rand.org>; Mon, 17 Jan 2000 17:08:21 -0800 (PST)
Received: from mail.nctimes.net (mail.nctimes.net [208.239.16.66]) by mail02-lax.pilot.net with ESMTP id PAA24287 for <voynich@rand.org>; Mon, 17 Jan 2000 15:25:46 -0800 (PST)
Received: from nctimes.net ([208.239.20.198]) by mail.nctimes.net
          (Netscape Messaging Server 3.5)  with ESMTP id AAA52DB;
          Mon, 17 Jan 2000 15:22:13 -0800
Message-ID: <3883A4C6.8F6A30F3@nctimes.net>
Date: Mon, 17 Jan 2000 15:24:54 -0800
From: Mark Perakh <perakh@nctimes.net>
Reply-To: perakh@nctimes.net
Organization: home
X-Mailer: Mozilla 4.7 [en] (Win98; I)
X-Accept-Language: en,ru
MIME-Version: 1.0
To: Rene <Zandbergen@t-online.de>
Cc: voynich@rand.org
Subject: Re: A few LSC comments
References: <12AJxt-0OiTYmC@fwd03.sul.t-online.de>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR

Rene, thanks for the reply. I was pleasantly surprised that, besides you
and Gabriel, Jim has also looked into LSC.  I have not checked the
formula you used: having found that your Se curve coincided with our
results, I simply assumed that you had just slightly changed the
notation but essentially used our formula for Se.  You said you
simplified our formula. I did not see it that way, but it is up to
everybody to change the formula's appearance as one wishes, as long as
it yields the same result. As to the expected results with order 3 word
monkeys etc., I have said what I thought of that in my previous reply to
Gabriel, so I don't think I can add anything new, and I prefer to wait
until you conduct the measurements and we'll see how it works.
Best to all, Mark

Rene wrote:

> I haven't yet been able to do any further calculations and comparisons,
> but here are a few assorted comments.
>
> First of all, Jim Reeds detected an error in the calculation of Se, but
> it soon turned out that this is a typo on my web page.
> Where I wrote:
> Se = 2 (L* - k) ( 1 - SUM ( etc....   ))
> it should have been:
> Se = 2 (L* - n) ( 1 - SUM ( etc....   ))
>
> Secondly, I agree with Gabriel that using a 3rd order word monkey
> would be even more interesting in terms of checking the capabilities
> of the LSC method in detecting meaningful text. On the other hand,
> getting meaningful word entropy statistics is even more difficult
> than getting 3rd order character entropy values, so the text from
> a 3rd order word monkey will repeat the source text from which the
> statistics have been drawn much more closely than should be the
> case. As before, a 1st order word monkey will be equivalent to a
> random permutation of words, and if it is true (in a statistically
> significant manner) that the LSC test distinguishes between one and
> the other, we do have another useful piece of evidence w.r.t. the
> Voynich MS text.
>
> In view of the difficulty in judging the meaningfulness of my
> quick tests using character monkeys, I'll add 95% (e.g.) confidence
> intervals around the Sm curves for the random texts.
> Some time in the next few days.
>
> More later, Rene

From jim@mail.rand.org  Tue Jan 18 07:30:28 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id HAA22907
	for <reeds@fry.research.att.com>; Tue, 18 Jan 2000 07:30:28 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id 4E9ED1E022; Tue, 18 Jan 2000 07:30:28 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail03-lax.pilot.net (mail-lax-3.pilot.net [205.139.40.17])
	by mail-green.research.att.com (Postfix) with ESMTP id BB01D1E01A
	for <reeds@research.att.com>; Tue, 18 Jan 2000 07:30:27 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail03-lax.pilot.net with ESMTP id EAA23263; Tue, 18 Jan 2000 04:30:21 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id EAA29201; Tue, 18 Jan 2000 04:30:20 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id EAA01093 for <voynich@rand.org>; Tue, 18 Jan 2000 04:30:08 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id EAA29138 for <voynich@rand.org>; Tue, 18 Jan 2000 04:30:06 -0800 (PST)
Received: from mailer3.bham.ac.uk (mailer3.bham.ac.uk [147.188.128.54]) by mail03-lax.pilot.net with ESMTP id EAA22778 for <voynich@rand.org>; Tue, 18 Jan 2000 04:26:14 -0800 (PST)
Received: from bham.ac.uk ([147.188.128.127])
	by mailer3.bham.ac.uk with esmtp (Exim 3.02 #16)
	id 12AXi0-0004N9-00
	for voynich@rand.org; Tue, 18 Jan 2000 12:26:04 +0000
Received: from is-fs13.bham.ac.uk ([147.188.128.55])
	by bham.ac.uk with esmtp (Exim 3.10 #1)
	id 12AXi0-0000iQ-01
	for voynich@rand.org; Tue, 18 Jan 2000 12:26:04 +0000
Received: from BHAM-IS-FS13/SpoolDir by is-fs13.bham.ac.uk (Mercury 1.44);
    18 Jan 00 12:26:02 +0000
Received: from SpoolDir by BHAM-IS-FS13 (Mercury 1.44); 18 Jan 00 12:25:46 +0000
Received: from golem (147.188.72.20) by is-fs13.bham.ac.uk (Mercury 1.44) with ESMTP;
    18 Jan 00 12:25:38 +0000
From: "Gabriel Landini" <G.Landini@bham.ac.uk>
Organization: The University of Birmingham, UK.
To: voynich@rand.org
Date: Tue, 18 Jan 2000 12:24:12 -0000
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Subject: doaro
Reply-To: G.Landini@bham.ac.uk
X-mailer: Pegasus Mail for Win32 (v3.12a)
Message-ID: <C7397E3F@is-fs13.bham.ac.uk>
Sender: jim@mail.rand.org
Status: O

Hi,
I cracked the vms and it is written in Portuguese. 
In folio 68r3, the 7 stars (Pleiades?) from the constellation of
Taurus reads "doaro", which corresponds to the character
substitution of "touro" (Is that correct, Jorge?).
That's it. Now you can read the rest. :-)

Now, more seriously, if that cluster of stars is the Pleiades, then the
nearest bright star should be Aldebaran. Am I correct?

regards,

Gabriel

From jim@mail.rand.org  Tue Jan 18 11:31:49 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id LAA61687
	for <reeds@fry.research.att.com>; Tue, 18 Jan 2000 11:31:49 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id F2D644CE1E; Tue, 18 Jan 2000 11:31:49 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail03-lax.pilot.net (mail-lax-3.pilot.net [205.139.40.17])
	by mail-blue.research.att.com (Postfix) with ESMTP id 835FD4CE1D
	for <reeds@research.att.com>; Tue, 18 Jan 2000 11:31:48 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail03-lax.pilot.net with ESMTP id IAA23395; Tue, 18 Jan 2000 08:31:42 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id IAA11186; Tue, 18 Jan 2000 08:31:40 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id IAA17498 for <voynich@rand.org>; Tue, 18 Jan 2000 08:31:23 -0800 (PST)
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id IAA11133 for <voynich@rand.org>; Tue, 18 Jan 2000 08:31:21 -0800 (PST)
Received: from mailer3.bham.ac.uk (mailer3.bham.ac.uk [147.188.128.54]) by mail02-lax.pilot.net with ESMTP id IAA02148 for <voynich@rand.org>; Tue, 18 Jan 2000 08:31:20 -0800 (PST)
Received: from bham.ac.uk ([147.188.128.127])
	by mailer3.bham.ac.uk with esmtp (Exim 3.02 #16)
	id 12AbXG-0007dt-00
	for voynich@rand.org; Tue, 18 Jan 2000 16:31:14 +0000
Received: from is-fs13.bham.ac.uk ([147.188.128.55])
	by bham.ac.uk with esmtp (Exim 3.10 #1)
	id 12AbXG-0007iX-01
	for voynich@rand.org; Tue, 18 Jan 2000 16:31:14 +0000
Received: from BHAM-IS-FS13/SpoolDir by is-fs13.bham.ac.uk (Mercury 1.44);
    18 Jan 00 16:31:13 +0000
Received: from SpoolDir by BHAM-IS-FS13 (Mercury 1.44); 18 Jan 00 16:31:02 +0000
Received: from golem (147.188.72.20) by is-fs13.bham.ac.uk (Mercury 1.44) with ESMTP;
    18 Jan 00 16:30:53 +0000
From: "Gabriel Landini" <G.Landini@bham.ac.uk>
Organization: The University of Birmingham, UK.
To: voynich@rand.org
Date: Tue, 18 Jan 2000 16:29:26 -0000
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Subject: Re: doaro
Reply-To: G.Landini@bham.ac.uk
In-reply-to: <200001181610.OAA07240@coruja.dcc.unicamp.br>
References: <C7397E3F@is-fs13.bham.ac.uk>
X-mailer: Pegasus Mail for Win32 (v3.12a)
Message-ID: <2130820F4@is-fs13.bham.ac.uk>
Sender: jim@mail.rand.org
Status: OR

On 18 Jan 00, at 14:10, Jorge Stolfi wrote:

> "doaro" is to be read "subaru", which is the name of the Pleiades in
> Japanese.

> Actually, we both may be right. The spelling "abiril" for April is
> strong evidence that the author was Japanese (note the extra "i") and
> learned the Western month names in Portugal or Spain (thus "b" instead
> of "p").

It could be :-)

> BTW, that pretty much identifies the author as one of the three
> Japanese converts who toured Europe (chiefly Lisbon and Rome) from
> 1582 to 1590. Presumably the pilgrims left the VMS at Rome,

Hm... I thought that the author was Francis Xavier, who went to
Kagoshima (where I lived) in 1549. 
He died in China, so his vms must have been carried back to 
Europe by somebody else. This should narrow down the language 
to Mandarin, Cantonese or Japanese.

Seriously, if you read the following article, it says that he was 
involved with translations (most likely for religious purposes), 
perhaps translating Spanish/Latin into Japanese?

http://www.newadvent.org/cathen/06233b.htm

Cheers,

Gabriel


From jim@mail.rand.org  Tue Jan 18 11:44:53 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id LAA45526
	for <reeds@fry.research.att.com>; Tue, 18 Jan 2000 11:44:53 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id 476554CE23; Tue, 18 Jan 2000 11:44:53 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-blue.research.att.com (Postfix) with ESMTP id ADA944CE1E
	for <reeds@research.att.com>; Tue, 18 Jan 2000 11:44:52 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id IAA26227; Tue, 18 Jan 2000 08:44:48 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id IAA12665; Tue, 18 Jan 2000 08:44:46 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id IAA19786 for <voynich@rand.org>; Tue, 18 Jan 2000 08:44:35 -0800 (PST)
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id IAA12628 for <voynich@rand.org>; Tue, 18 Jan 2000 08:44:33 -0800 (PST)
Received: from grande.dcc.unicamp.br (grande.dcc.unicamp.br [143.106.7.8]) by mail02-lax.pilot.net with ESMTP id IAA07636 for <voynich@rand.org>; Tue, 18 Jan 2000 08:44:31 -0800 (PST)
Received: from amazonas.dcc.unicamp.br (amazonas.dcc.unicamp.br [143.106.7.11])
	by grande.dcc.unicamp.br (8.9.3/8.9.3) with ESMTP id OAA14188
	for <voynich@rand.org>; Tue, 18 Jan 2000 14:44:05 -0200 (EDT)
Received: from coruja.dcc.unicamp.br (coruja.dcc.unicamp.br [143.106.24.80])
	by amazonas.dcc.unicamp.br (8.8.5/8.8.5) with ESMTP id OAA22493
	for <voynich@rand.org>; Tue, 18 Jan 2000 14:44:03 -0200 (EDT)
Received: (from stolfi@localhost)
	by coruja.dcc.unicamp.br (8.8.5/8.8.5) id OAA07256;
	Tue, 18 Jan 2000 14:44:02 -0200 (EDT)
Date: Tue, 18 Jan 2000 14:44:02 -0200 (EDT)
Message-Id: <200001181644.OAA07256@coruja.dcc.unicamp.br>
From: Jorge Stolfi <stolfi@dcc.unicamp.br>
To: voynich@rand.org
Subject: Re: doaro
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=iso-8859-1
In-Reply-To: <C7397E3F@is-fs13.bham.ac.uk>
References: <C7397E3F@is-fs13.bham.ac.uk>
Reply-To: stolfi@dcc.unicamp.br
Sender: jim@mail.rand.org
Status: O


    > [Gabriel:] I cracked the vms and it is written in Portuguese. 
    > In folio 68r3, the 7 stars (Pleiades?)  from the constellation of 
    > Taurus reads "doaro" which is corresponds to the character 
    > substitution of "touro" (Is that correct Jorge?).
    > That's it. Now you can read the rest. :-)

Well, the Portuguese word is indeed "touro", but I'm afraid that it is
a red herring. Considering that the "d" looks pretty much like the "s"
of Rene's 15th century German alphabet, it is more likely that EVA
"doaro" is to be read "subaru", which is the name of the Pleiades in
Japanese.

Actually, we both may be right. The spelling "abiril" for April is
strong evidence that the author was Japanese (note the extra "i") and
learned the Western month names in Portugal or Spain (thus "b" instead
of "p"). Now, as you know, "ba" in Japanese kana is written "ha" with
the "ten-ten" accent mark, and "h" is silent in Portuguese.
Obviously he must have concluded that Europeans don't care for "b"s, and
thus he omitted that letter from his new "European style" alphabet.

Thus we get 

   EVA   Roman
    d  =   s
    o  =   u
    a  =   a
    r  =   r

I should point out also that "o" is often pronounced "u" in
Portuguese. Case closed.

BTW, that pretty much identifies the author as one of the three
Japanese converts who toured Europe (chiefly Lisbon and Rome) from
1582 to 1590.[1] Presumably the pilgrims left the VMS at Rome, and
Baresch got it there somehow around 1605, when he was at La Sapienza
"sapientiae operam daturus".[2]

--stolfi 8-)

[1] http://www.unigre.urbe.it/vallejo/Gennaio.html
    (Search for "Giappone")
    "... Eorumque adventum ex alio orbe terrarum non modo Roma et Italia, 
    verum etiam Lusitania et Hispania summis celebraverunt studiis. "
    That text mentions a 300+ report of the pilgrims' 
    trip, edited by Eduardus (Duarte) de Sande SJ, and 
    printed in 1590 by the Macao mission's press.

[2] http://www.voynich.nu/letters.html

From jim@mail.rand.org  Wed Jan 19 17:17:54 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id RAA71239
	for <reeds@fry.research.att.com>; Wed, 19 Jan 2000 17:17:54 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id 9B2F84CE23; Wed, 19 Jan 2000 17:17:54 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail02-lax.pilot.net (mail-lax-2.pilot.net [205.139.40.16])
	by mail-blue.research.att.com (Postfix) with ESMTP id D8EEC4CE1F
	for <reeds@research.att.com>; Wed, 19 Jan 2000 17:17:53 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail02-lax.pilot.net with ESMTP id OAA20188; Wed, 19 Jan 2000 14:17:46 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id OAA24326; Wed, 19 Jan 2000 14:17:42 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id OAA28193 for <voynich@rand.org>; Wed, 19 Jan 2000 14:16:50 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id OAA24222 for <voynich@rand.org>; Wed, 19 Jan 2000 14:16:49 -0800 (PST)
Received: from mail.nctimes.net (mail.nctimes.net [208.239.16.66]) by mail01-lax.pilot.net with ESMTP id OAA12088 for <voynich@rand.org>; Wed, 19 Jan 2000 14:16:48 -0800 (PST)
Received: from nctimes.net ([208.239.20.126]) by mail.nctimes.net
          (Netscape Messaging Server 3.5)  with ESMTP id AAA3A23;
          Wed, 19 Jan 2000 14:13:13 -0800
Message-ID: <3886379D.4D1B8525@nctimes.net>
Date: Wed, 19 Jan 2000 14:15:58 -0800
From: Mark Perakh <perakh@nctimes.net>
Reply-To: perakh@nctimes.net
Organization: home
X-Mailer: Mozilla 4.7 [en] (Win98; I)
X-Accept-Language: en,ru
MIME-Version: 1.0
To: Rene <Zandbergen@t-online.de>
Cc: voynich@rand.org
Subject: Re: A few LSC comments
References: <12AJxt-0OiTYmC@fwd03.sul.t-online.de>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 8bit
Sender: jim@mail.rand.org
Status: OR



Rene, I have had a few thoughts in regard to the monkey texts. You have
slightly modified our formula for Se by assuming that the distribution of
text elements (letters, digrams, trigrams, n-grams) is with replacement.
Let me say something about this assumption. I believe we have to
distinguish between four situations, to wit:
1) Texts generated by permutations of the above elements (as it was the
case in our study). In this case there is a limited stock of the above
elements, hence there is a negative correlation between elements
distributions in chunks, and therefore it is a case without replacement
(hypergeometric distribution).  Our formula for Se was derived for that
situation.
2) Monkey texts generated by using the probabilities of elements (letters,
digraphs, etc) and also assuming that the stock of those elements is the
same as that available for the original meaningful text.  In this case we
have again negative correlation and it is a no-replacement case
(hypergeometric) so our formula is to be used without a modification.
3) The text generated as in item 2) but assuming the stock of letters is
much, much larger (say 100,000 times larger) than that available in the
original text, while preserving the ratios of element occurrences as in
the original text.  This is a case with replacement (approximately but with
increasing accuracy as the size of the stock increases). In this case our
formula has to be modified (as indicated in paper 1) using multinomial
variance.  Quantitatively the difference is only in L/(L-1) coefficient
which at L>>1 is negligible.
4) The text generated assuming the stock of elements is infinitely large.
In this case the distribution of elements is uniform, i.e. the
probabilities of all elements become equal to each other (each equal to 1/z,
where z is the number of all possible elements (letters, or digrams, etc)
in the original text). In this case formula for Se simplifies (I derived it
in paper 1 for that case as an approximation to roughly estimate Se for
n>1).  Quantitatively cases 1 through 3 are very close, but case 4 produces
quantities measurably (but not very much) differing from cases 1 through 3
(see examples in paper 1).
All of the above has only purely academic interest, but sometimes I am a
stickler, for the sake of some abstract accuracy.  Practically it is
inconsequential unless very short texts are used.  Cheers, Mark
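
To put a number on the with/without replacement distinction in cases 1 through 3, here is a minimal sketch. It uses only the textbook variances of a single element count in a chunk (binomial versus hypergeometric), not the Se formula from the papers, and the function name and the sample values of L, M and n are made up for illustration.

```python
from fractions import Fraction

def count_variance(L, M, n, with_replacement):
    """Variance of the number of copies of one element found in a chunk of
    n positions, when the whole text has L positions and M copies of it."""
    p = Fraction(M, L)
    binomial = n * p * (1 - p)          # drawing with replacement
    if with_replacement:
        return binomial
    # Without replacement (hypergeometric): finite-population correction.
    return binomial * Fraction(L - n, L - 1)

# Illustrative values only: a 10000-letter text, 800 copies of the element,
# chunks of 50 letters.
L, M, n = 10000, 800, 50
ratio = count_variance(L, M, n, False) / count_variance(L, M, n, True)
print(float(ratio))
```

For chunks much shorter than the text the ratio stays close to 1, which is in line with the remark that the distinction matters only for very short texts.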


From jim@mail.rand.org  Thu Jan 20 08:25:34 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id IAA94665
	for <reeds@fry.research.att.com>; Thu, 20 Jan 2000 08:25:34 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id 34C334CE2C; Thu, 20 Jan 2000 08:25:34 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-blue.research.att.com (Postfix) with ESMTP id B15074CE27
	for <reeds@research.att.com>; Thu, 20 Jan 2000 08:25:33 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id FAA29472; Thu, 20 Jan 2000 05:25:25 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id FAA25354; Thu, 20 Jan 2000 05:25:24 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id FAA06832 for <voynich@rand.org>; Thu, 20 Jan 2000 05:25:11 -0800 (PST)
From: rzandber@esoc.esa.de
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id FAA25333 for <voynich@rand.org>; Thu, 20 Jan 2000 05:25:10 -0800 (PST)
Received: from esacom43.esoc.esa.de (esacom43.esoc.esa.de [131.176.86.4]) by mail03-lax.pilot.net with ESMTP id FAA13189 for <voynich@rand.org>; Thu, 20 Jan 2000 05:25:09 -0800 (PST)
Received: from esacom53.esoc.esa.de (esacom53.esoc.esa.de [131.176.85.6])
	by esacom43.esoc.esa.de (8.9.2/8.9.2/ESA-ESOC-v1.8) with ESMTP id NAA11554
	for <voynich@rand.org>; Thu, 20 Jan 2000 13:06:00 GMT
Received: from esocmail1.esoc.esa.de (esocmail3.dev.esoc.esa.de [131.176.51.30])
	by esacom53.esoc.esa.de (8.9.2/8.9.2/ESA-ESOC-mail-gw-v1.5) with SMTP id NAA12651
	for <voynich@rand.org>; Thu, 20 Jan 2000 13:17:29 GMT
Received: by esocmail1.esoc.esa.de(Lotus SMTP MTA v4.6.3  (733.2 10-16-1998))  id 4125686C.00496B1D ; Thu, 20 Jan 2000 14:21:55 +0100
X-Lotus-FromDomain: ESA
To: voynich@rand.org
Message-ID: <4125686C.00496971.00@esocmail1.esoc.esa.de>
Date: Thu, 20 Jan 2000 14:22:03 +0100
Subject: Gif...
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: jim@mail.rand.org
Status: OR

Dear all,

May I recommend an image from an Indian cook book (Madhur Jaffrey: A taste of
India - the definitive guide to regional cooking).

http://www.voynich.nu/indian.gif

Remind anyone of anything? :-)

Cheers, Rene

PS: Gabriel: my E-mail is bouncing. Let me know if you receive this.


From jim@mail.rand.org  Fri Jan 21 22:34:14 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id WAA69116
	for <reeds@fry.research.att.com>; Fri, 21 Jan 2000 22:34:14 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id 0F6124CE09; Fri, 21 Jan 2000 22:34:14 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-blue.research.att.com (Postfix) with ESMTP id 657294CE05
	for <reeds@research.att.com>; Fri, 21 Jan 2000 22:34:13 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id TAA08774; Fri, 21 Jan 2000 19:34:05 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id TAA27771; Fri, 21 Jan 2000 19:34:04 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id TAA08626 for <voynich@rand.org>; Fri, 21 Jan 2000 19:33:36 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id TAA27739 for <voynich@rand.org>; Fri, 21 Jan 2000 19:33:34 -0800 (PST)
Received: from grande.dcc.unicamp.br (grande.dcc.unicamp.br [143.106.7.8]) by mail03-lax.pilot.net with ESMTP id RAA13502 for <voynich@rand.org>; Fri, 21 Jan 2000 17:53:16 -0800 (PST)
Received: from amazonas.dcc.unicamp.br (amazonas.dcc.unicamp.br [143.106.7.11])
	by grande.dcc.unicamp.br (8.9.3/8.9.3) with ESMTP id XAA05710;
	Fri, 21 Jan 2000 23:52:10 -0200 (EDT)
Received: from coruja.dcc.unicamp.br (coruja.dcc.unicamp.br [143.106.24.80])
	by amazonas.dcc.unicamp.br (8.8.5/8.8.5) with ESMTP id XAA17407;
	Fri, 21 Jan 2000 23:52:07 -0200 (EDT)
Received: (from stolfi@localhost)
	by coruja.dcc.unicamp.br (8.8.5/8.8.5) id XAA02392;
	Fri, 21 Jan 2000 23:52:07 -0200 (EDT)
Date: Fri, 21 Jan 2000 23:52:07 -0200 (EDT)
Message-Id: <200001220152.XAA02392@coruja.dcc.unicamp.br>
From: Jorge Stolfi <stolfi@dcc.unicamp.br>
To: perakh@nctimes.net
Cc: voynich@rand.org
Subject: Reinterpreting the LSC (long)
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=iso-8859-1
In-Reply-To: <3886379D.4D1B8525@nctimes.net>
References: <12AJxt-0OiTYmC@fwd03.sul.t-online.de>
	<3886379D.4D1B8525@nctimes.net>
Reply-To: stolfi@dcc.unicamp.br
Sender: jim@mail.rand.org
Status: OR


Mark, I haven't had the time to read your papers carefully enough,
but the idea of the LSC seems quite interesting. Here are some of my
comments.

Why should the LSC work?

  In a very broad sense, the LSC and the nth-order character/word
  entropies are trying to measure the same thing, namely the
  correlation between letters that are a fixed distance apart.

  People have observed before that correlation between samples n steps
  apart tends to be higher for "meaningful" signals than for "random"
  ones, even for large n. The phenomenon has been observed in music,
  images, DNA sequences, etc. This knowledge has been useful for, among
  other things, designing good compression and approximation methods for
  such signals. Some of the buzzwords one meets in that context are
  "fractal", "1/f noise", "wavelet", "multiscale energy", etc. (I
  believe that Gabriel has written papers on fractals in the context of
  medical imaging. And a student of mine just finished her thesis on
  reassembling pottery fragments by matching their outlines, which turn
  out to be "fractal" too.) 
  
  As I try to show below, one can understand the LSC as decomposing
  the text into various frequency bands, and measuring the `power'
  contained in each band. If we do that to a random signal, we will
  find that each component frequency has roughly constant expected
  power; i.e. the power spectrum is flat, like that of ideal white
  light (hence the nickname `white noise'.)  On the other hand, a
  `meaningful' signal (like music or speech) will be `lumpier' than a
  random one, at all scales; so its power spectrum will show an excess
  power at lower frequencies. It is claimed that, in such signals, the
  power tends to be inversely proportional to the frequency; hence the
  moniker `1/f noise'.  
  
  If we lump the spectrum components into frequency bands, we will
  find that the total power contained in the band of frequencies
  between f and 2f will be proportional to f for a random signal, but
  roughly constant for a `meaningful' signal whose spectrum indeed
  follows the 1/f profile.
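
  The octave-band claim can be checked with a few lines of arithmetic.
  This is only a sketch with idealized spectra (exactly flat, and
  exactly 1/f) on integer frequencies; the function names are mine.

```python
import math

def band_power(spectrum, f):
    """Total power in the octave band [f, 2f) of a discrete spectrum."""
    return sum(spectrum(k) for k in range(f, 2 * f))

def flat(k):
    """White noise: the same expected power at every frequency."""
    return 1.0

def pink(k):
    """Idealized 1/f profile."""
    return 1.0 / k

for f in (16, 64, 256, 1024):
    print(f, band_power(flat, f), round(band_power(pink, f), 4))
```

  The flat spectrum gives a band power equal to f, while the 1/f
  spectrum gives a value that approaches ln 2 as f grows.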
  
Is the LSC better than nth-order entropy?

  In theory, the nth-order entropies are more powerful indicators of
  structure. Roughly speaking, *any* regular structure in the text
  will show up in some nth-order entropy; whereas I suspect that one can
  construct signals that have strong structure (hence low entropy)
  but the same LSC as a purely random text.

  However, the formula for nth-order entropy requires one to estimate
  z**n probabilities, where z is the size of the alphabet. To do that
  reliably, one needs a corpus whose length is many times z**n. So the
  entropies are not very meaningful for n beyond 3 or so.
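
  To make the z**n problem concrete, here is a rough sketch. It
  assumes the common definition of nth-order entropy as the
  difference between the n-gram and (n-1)-gram block entropies; the
  function names and the sample string are just for illustration.

```python
import math
from collections import Counter

def block_entropy(text, n):
    """Shannon entropy (bits) of the empirical distribution of n-grams."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    counts = Counter(grams)
    total = len(grams)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def nth_order_entropy(text, n):
    """Entropy of a letter given the n-1 letters before it (assumed
    definition: block entropy of order n minus that of order n-1)."""
    if n == 1:
        return block_entropy(text, 1)
    return block_entropy(text, n) - block_entropy(text, n - 1)

sample = "abracadabra" * 50
z = len(set(sample))
for n in (1, 2, 3):
    # z**n is the number of probabilities that must be estimated.
    print(n, z ** n, round(nth_order_entropy(sample, n), 3))
```

  Even with this 5-letter alphabet the table of probabilities has
  125 entries at n = 3; with z around 25 it has over 15000.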

  The nth-order LSC seems to be numerically more stable, because it maps
  blocks of n consecutive letters into a single `super-letter' which
  is actually a vector of z integers; and compares these super-letters
  as vectors (with difference-squared metric) rather than symbols
  (with simple 0-1 metric). I haven't done the math --- perhaps you
  have --- but it seems that computing the n-th order LSC to a fixed
  accuracy requires a corpus whose length L is proportional to z*n (or
  perhaps z*n**2?) instead of z**n.
  
  Moreover, one kind of structure that the LSC *can* detect is any
  medium- and long-range variation in word usage frequency along the
  text. (In fact, the LSC seems to have been designed specifically for
  that purpose.) As observed above, such variations are present in
  most natural languages, but absent in random texts, even those
  generated by kth-order monkeys.
  
  Specifically, if we take the output of a k-th order `letter
  monkey' and break it into chunks whose length n >> k, we will find
  that the number of times a given letter occurs in each chunk is
  fairly constant (except for sampling error) among all chunks. For
  kth-order `word monkeys' we should have the same result as long as 
  n >> k*w, where w is the average word length. On the other hand, a
  natural-language text will show variations in letter frequencies,
  which are due to changes of topic and hence vocabulary changes, that
  extend for whole paragraphs or chapters.
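
  A toy experiment shows the effect. Everything here is an
  assumption made for the sake of the example: the two-topic
  "natural" text, the use of a plain shuffle as an order-0 letter
  monkey, the chunk size, and the choice of the letter "z".

```python
import random
from statistics import pvariance

def chunk_counts(text, letter, n):
    """Occurrences of `letter` in each complete chunk of n characters."""
    return [text[i:i + n].count(letter)
            for i in range(0, len(text) - n + 1, n)]

# Toy "natural" text: the topic (and so the vocabulary) changes halfway.
natural = "the cat sat on the mat " * 25 + "zebras zigzag lazily " * 25

# Order-0 "letter monkey": same letters, positions shuffled.
rng = random.Random(0)              # fixed seed, only for repeatability
letters = list(natural)
rng.shuffle(letters)
monkey = "".join(letters)

n = 100
var_natural = pvariance(chunk_counts(natural, "z", n))
var_monkey = pvariance(chunk_counts(monkey, "z", n))
print(var_natural, var_monkey)
```

  The chunk-to-chunk variance of the count is much larger for the
  text with a topic change than for its shuffled version, where only
  sampling error remains.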
  
  Thus, although the LSC may not be powerful enough to detect the
  underlying structure in non-trivial ciphers, it seems well suited at
  distinguishing natural language from monkey-style random text.
  
Another view of LSC

  Someone -- I think it was Goethe -- observed that mathematicians
  are like Frenchmen: whatever you say to them, they immediately translate
  into their language, and it then becomes something else entirely.
  Although I am not a mathematician, I cannot resist the chance
  to translate your LSC definition into my own brand of `French.'
  Here it goes:
  
  STEP 1 of the LSC computation consists in replacing each letter Y by
    a vector of z zeros and ones, where the r-th component of the
    vector is 1 if and only if Y is the r-th letter of the alphabet.
    Thus the original text is replaced by z binary strings of the same
    length.  Steps 2 and 3 below are applied to each
    of these z binary strings, separately.

  STEP 2 takes a binary string y(i), i = 0..L-1 and produces a string
    of integers Y^n_j, j = 0..L/n-2, by the formula
    
      Y^n_j = \sum_{i=0}^{L-1} h^n_j(i) y(i),               (1)
      
    where h^n_j() is the `kernel function' 

                 { +1 for jn \leq i < (j+1)n,
      h^n_j(i) = { -1 for (j+1)n \leq i < (j+2)n,           (0)
                 { 0 for all other values of i.
    
    In words,  h^n_j() is a train of n `+1's followed
    by n `-1's, all surrounded by a sea of zeros.
    (Here "^n" is merely a superscript index, not a power.)

  STEP 3 takes the numbers Y^n_j and computes the sums

      S^n = \sum_{j=0}^{L/n-2} (Y^n_j)**2                   (2)
      
  STEP 4 adds together the sums S^n_x obtained for 
    different letters x, into a single sum S^n_tot.
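
  Steps 1 to 4 as restated above can be sketched in a few lines.
  This follows my restatement, not necessarily the exact procedure
  of the papers; the function name is mine, and dropping an
  incomplete last chunk is an assumption.

```python
def lsc_sum(text, n):
    """S^n_tot: sum over letters x and chunk pairs j of (Y^n_j)**2."""
    chunks = len(text) // n            # assumption: incomplete last chunk ignored
    total = 0
    for x in sorted(set(text)):
        # STEP 1: indicator signal y(i) for letter x.
        y = [1 if c == x else 0 for c in text]
        # Count of x in each chunk of n consecutive positions.
        counts = [sum(y[j * n:(j + 1) * n]) for j in range(chunks)]
        # STEP 2: Y^n_j = (count in chunk j) - (count in chunk j+1),
        # which is formula (1) with the +1/-1 kernel of formula (0).
        Y = [counts[j] - counts[j + 1] for j in range(chunks - 1)]
        # STEPS 3 and 4: square, sum over j, accumulate over letters.
        total += sum(v * v for v in Y)
    return total

print(lsc_sum("aabbaabb", 2), lsc_sum("abababab", 2))
```

  The first string changes its letter content from chunk to chunk
  and gives a large sum; the second has identical chunks and gives
  zero.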

LSC as multiscale signal analysis

  The point of using this convoluted formalism is that formula (1) can
  be interpreted as a `scalar product' between the `signal' y() and
  the `kernel' h^n_j(). Except for a constant factor, the coefficient
  Y^n_j is therefore a measure of `how much' of h^n_j() is contained
  in the signal y().
  
  This interpretation of formula (1) is not restricted to the
  particular kernel functions h^n_j of definition (0), but can be used
  with any set of kernels h_r(); in that case, we would write
  
      Y_r = \sum_{i=0}^{L-1}  h_r(i) y(i)                   (1')
  
  This interpretation becomes more interesting when the kernel
  functions h_r() are orthonormal, i.e. \sum_i h_r(i) h_s(i) = 0 for r
  different from s, and \sum_i h_r(i)**2 = 1. Then the signal
  
      y~(i) = \sum_r  Y_r  h_r(i)  

  is the combination of kernel functions that best approximates the
  signal y(). In particular, if every signal y() is a linear
  combination of kernel functions (i.e. the kernels form a basis of
  signal space), the numbers Y_r are the coefficients of that
  combination.
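
  As a small numeric illustration of formula (1') and of the
  reconstruction y~(), here is a sketch with four orthonormal
  kernels on signals of length 4. The particular kernels (a
  normalized Haar-type set) and the sample signal are my choice for
  the example.

```python
import math

def dot(u, v):
    """Scalar product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

r = 1 / math.sqrt(2)
# A small orthonormal basis for signals of length 4.
kernels = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
    [r, -r, 0.0, 0.0],
    [0.0, 0.0, r, -r],
]

y = [1, 0, 0, 1]
coeffs = [dot(h, y) for h in kernels]                 # formula (1')
y_approx = [sum(c * h[i] for c, h in zip(coeffs, kernels))
            for i in range(len(y))]                   # y~(i)
print(coeffs)
print(y_approx)
```

  Since these four kernels form a basis, y~() reproduces y() up to
  rounding, and the sum of the squared coefficients equals the sum
  of the squared samples.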
  
  The decomposition of a signal y() into a combination of certain
  kernel functions is a basic tool of signal processing. For instance,
  in Fourier theory the kernels h_r are sinusoids and co-sinusoids on
  the parameter i with various frequencies, and the numbers Y_r are
  the corresponding Fourier coefficients.
  
  The Fourier kernels have many nice properties, but they extend over
  the whole i range; thus changing a single sample y(i) will change
  all the coefficients Y_r. For that reason, in some applications
  people prefer to use kernels that have a bounded support --- i.e.
  they are zero, or practically zero, outside of a small range of the
  i parameter. A set of such kernels is called a {\em spline basis}.

  However, if Fourier kernels suffer for being too `global', spline
  kernels suffer for being too `local': in order to approximate a
  broad hump y() one needs to add many narrow bumps. {\em Wavelets}
  were invented to fill the gap between these two extremes.
  
  Note that all the functions h^n_j() defined by formula (0) are just
  like the function h^1_0(), only stretched by a factor of n and shifted
  by n*j. This scale-and-shift property is basically the definition of a
  `wavelet family'. The parameter n is called the `scale' of the
  wavelet, and the index j could be called its `phase'.

  Wavelet theory is based on the observation that, if one picks a `mother
  wavelet' h^1_0() with the right shape, then the following wavelets are
  mutually orthogonal and span the space of all signals:

      h^1_0, h^1_1, h^1_2, h^1_3, h^1_4, h^1_5, ...,
      h^2_0, h^2_1, h^2_2, h^2_3, h^2_4, h^2_5, ...,
      h^4_0, h^4_1, h^4_2, h^4_3, h^4_4, h^4_5, ...,     (3)
      h^8_0, h^8_1, h^8_2, h^8_3, h^8_4, h^8_5, ...,

  Note that the scale n goes up in geometric progression, while the
  indices j are consecutive. The wavelets with intermediate scales --
  such as h^3_0 -- are redundant, in the sense that the corresponding
  coefficients Y^n_j can be computed from the coefficients of the
  wavelets listed above. Thus, for a signal of length L, we need only
  L/(w*n) wavelets of scale n, where w is the actual shift between
  h^1_0 and h^1_1. Therefore, the number of terms in each `layer' goes
  down geometrically, and there are only log(L) layers.
  
  The decomposition of a signal into a wavelet basis (3) is still
  fairly `local', in the sense that a change in one sample y(i)
  affects only O(log(L)) coefficients Y^n_j, corresponding to wavelets
  h^n_j() whose support includes the point i. On the other hand, a
  broad bump in the signal y() usually gets represented by a couple of
  large coefficients Y^n_j, corresponding to wavelets of roughly the
  same width and position as the bump, plus smaller `adjustment'
  coefficients with smaller scales.

  If we decompose a signal y() into non-redundant wavelets, and then
  take only the components with a fixed scale n, we are basically
  looking at the part of the signal that consists of details of size
  ~w*n. This is almost the same as decomposing y() into sinusoids of
  frequencies 1,2,3,..., and then adding only the components with
  frequencies between roughly L/(2n) and L/n.

  So a byproduct of wavelet analysis is a decomposition of the signal
  y() into log(L) parts or `bands,' one for each layer of the basis.
  If we add these components from the bottom up, after each stage we
  obtain a picture of y() that is twice as sharp as the previous one. In
  that case, the quantity S^n of formula (2) is the `strength' (or
  `power' in the usual jargon) of band n of the signal y().
  
  The kernels of definition (0), which are implied by the original LSC
  definition, are NOT orthogonal, but only because they are packed
  too tightly.  If we redefine them as

                 { +1 for 2jn \leq i < (2j+1)n,
      h^n_j(i) = { -1 for (2j+1)n \leq i < (2j+2)n,         (0')
                 { 0 for all other values of i.
  
  (with successive wavelets spaced by 2n rather than n), then all
  members of family (3) become pairwise orthogonal, and (except for
  scale factors) we get the so-called `Haar wavelet basis'.
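
  The effect of the spacing can be verified by brute force. In this
  sketch a kernel of scale n is n values of +1 followed by n values
  of -1, and only the distance between successive kernels is varied;
  the function names and the signal length 16 are arbitrary choices.

```python
def kernel(L, n, j, spacing):
    """n values of +1 then n values of -1, starting at position j*spacing."""
    h = [0] * L
    start = j * spacing
    for i in range(start, start + n):
        h[i] = 1
    for i in range(start + n, start + 2 * n):
        h[i] = -1
    return h

def dot(u, v):
    """Scalar product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

L = 16
# Tight packing: successive kernels of scale n are n apart, so they overlap.
print(dot(kernel(L, 2, 0, 2), kernel(L, 2, 1, 2)))
# Wide packing: successive kernels of scale n are 2n apart.
print(dot(kernel(L, 2, 0, 4), kernel(L, 2, 1, 4)))

# Whole family with scales 1, 2, 4, 8 and spacing 2n.
family = [kernel(L, n, j, 2 * n)
          for n in (1, 2, 4, 8) for j in range(L // (2 * n))]
print(all(dot(family[a], family[b]) == 0
          for a in range(len(family)) for b in range(a)))
```

  With the tight packing, neighbouring kernels have a nonzero
  scalar product; with spacing 2n every pair in the dyadic family is
  orthogonal, as expected of a Haar-type basis.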
  
  The bottom-most layer of the Haar basis has only one half of a
  wavelet, whose full width is 2*L; the corresponding coefficient is
  proportional to the average of all samples y(i). The next layer has
  a single wavelet with period L; its coefficient gives the difference
  between the average value of y() in the half-range [0..L/2-1] and
  the average value in [L/2..L-1]. The next layer has two wavelets;
  their coefficients compare the average value of y() between the
  first two quarters of the range [0..L-1], and between the last two
  quarters. And so on.
  
  When we add all layers of the Haar decomposition with scales >= n,
  we get a staircase approximation to the signal y(), with steps of
  width n. Each wavelet in the next layer will split one of these
  steps in two steps of width n/2, raising one side and lowering the
  other by the same amount.
  
Variations on the LSC

  This reinterpretation of the LSC exposes several choices that
  were implicit in the original definition.  We can get many
  other `LSC-like' indicators by picking different alternatives for
  those choices.  
  
  In STEP 1 the given text gets converted into one or more numeric
    `signals' y(i). This step is perhaps the most arbitrary part of the
    definition. Almost any mapping here will do, as long as it is
    local --- i.e. changing one letter of the text will affect only
    one signal sample y(i) (or at most a few adjacent samples).
    
    In the original definition, each letter is mapped to 1 or 0
    depending on whether it is equal to or different from a given
    letter x. This simple mapping has the advantage that it makes the
    LSC invariant under simple letter substitutions. It may also be
    the `most sensitive' mapping under the appropriate definition of
    sensitivity. But we could consider other mappings, especially
    if we already know something about the alphabet.  For instance,
    we could map consonants to +1, vowels to -1; or map 
    each letter to a weight that depends on its frequency over the
    whole text.
    
    Note that it is fairly immaterial how we treat spaces,
    punctuation, case, diacritics, digraphs, etc. We can treat those
    features as letters, delete them, lump them with adjacent letters,
    whatever. These choices will normally preserve the long-range
    correlations (although they may affect the exact value of those
    correlations).
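
    For concreteness, the two kinds of mapping mentioned above
    might look like this. The vowel set and the decision to skip
    non-letters are assumptions made for the example, and the
    function names are mine.

```python
VOWELS = set("aeiou")   # assumption: plain Latin vowels only

def indicator_signal(text, x):
    """Original mapping: 1 where the letter equals x, 0 elsewhere."""
    return [1 if c == x else 0 for c in text]

def vowel_consonant_signal(text):
    """Alternative mapping: consonants to +1, vowels to -1.
    Non-letters are simply dropped (one of several possible choices)."""
    return [-1 if c in VOWELS else 1 for c in text.lower() if c.isalpha()]

print(indicator_signal("daiin", "i"))
print(vowel_consonant_signal("daiin"))
```

    Both mappings are local in the sense described above: changing
    one letter of the text changes one sample of the signal.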
    
  In STEP 2 we analyze the signal y() against some repertoire
    of kernel functions h^n_j, and obtain the corresponding
    coefficients Y^n_j. 
    
    One conclusion we get from the `multiscale analysis'
    interpretation is that it is pointless to compute the LSC for all
    scales n between 1 and L. It is sufficient to consider only those
    scales n which are powers of two. This observation allows us to
    cut down the cost of computing the complete LSC plot from O(L**2)
    to O(L) --- a 10^5 speedup for typical texts!
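
    One way to get the linear cost is to reuse chunk counts between
    scales, since each chunk of size 2n is the union of two chunks
    of size n. The sketch below does that for a single signal y();
    the function names are mine, and leftover samples that do not
    fill a chunk are dropped, which is an assumption.

```python
def dyadic_chunk_counts(y):
    """Chunk sums of y for chunk sizes 1, 2, 4, ...; O(L) work in total."""
    levels = {1: list(y)}
    n = 1
    while len(levels[n]) >= 4:
        prev = levels[n]
        # Each chunk of size 2n is the sum of two adjacent chunks of size n.
        levels[2 * n] = [prev[k] + prev[k + 1]
                         for k in range(0, len(prev) - 1, 2)]
        n *= 2
    return levels

def dyadic_sums(y):
    """S^n for n = 1, 2, 4, ..., from differences of adjacent chunk sums."""
    return {n: sum((c[j] - c[j + 1]) ** 2 for j in range(len(c) - 1))
            for n, c in dyadic_chunk_counts(y).items()}

print(dyadic_sums([1, 1, 0, 0, 1, 1, 0, 0]))
```

    The levels have L, L/2, L/4, ... entries, so the total work is
    bounded by a constant times L.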
    
    The multiscale analysis interpretation also suggests that we could
    use other kernel functions in lieu of formula (0). From that
    viewpoint, we can see that the shape of the kernels is not
    important, as long as the application of formula (1) has the
    effect of decomposing the signal y() into frequency bands related
    to the scale n. It is not even necessary for the kernels to have
    compact support. We could use for instance the Fourier kernel
    (sinusoids), numbered so that the kernels with a given scale n
    have frequencies between L/n and L/2n.
    
    As observed above, the kernels implied by the original LSC
    definition (formula (0)) are redundant when all scales are taken
    together, and are not orthogonal. However, they are not redundant
    if we consider a single scale n. In fact, the kernels { h^n_j,
    j=0..L/n-2 } are linearly independent, and just one degree of
    freedom short of being a complete basis; so they do provide good
    coverage.
    
    Another problem with the original kernels is that they are not
    orthogonal. This would be an inconvenience for other applications,
    but it does not seem to be a problem if the only goal is to
    distinguish meaningful signals from random noise.
    
    A more serious defect is that the kernels are square-shaped with
    sharp discontinuities. Because of such discontinuities, each
    `layer' of the implied signal decomposition will cover a rather
    broad band of frequencies, and there will be significant overlap
    between bands. Unfortunately, I don't know how to explain this
    point in few words. 
    
    Anyway, the conclusion is that the use of discontinuous kernels may
    result in `blurred' LSC-versus-n plots, with spurious peaks and
    valleys known as `Gibbs effects' or `ringing.' It may be worth
    trying smoother kernels, such as gaussians and their derivatives,
    or gaussian-sinusoid products.
    
    In fact, you should consider using the smoothest possible kernels,
    namely the Fourier basis.  In other words, instead of the 
    LSC as defined, you could compute the power spectra of the 
    signals y(). 
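
    A bare-bones version of that suggestion, for one indicator
    signal, could look as follows. It uses a naive O(L**2)
    transform to stay self-contained; for real texts one would use
    an FFT routine instead. The function name and the test string
    are illustrative only.

```python
import cmath

def power_spectrum(y):
    """|DFT|^2 of y for frequencies 0..L/2 (naive O(L**2) transform)."""
    L = len(y)
    spectrum = []
    for k in range(L // 2 + 1):
        coeff = sum(y[i] * cmath.exp(-2j * cmath.pi * k * i / L)
                    for i in range(L))
        spectrum.append(abs(coeff) ** 2)
    return spectrum

text = "abababab"
y = [1 if c == "a" else 0 for c in text]
print([round(p, 6) for p in power_spectrum(y)])
```

    For this strictly periodic string all the power sits at
    frequency 0 (the mean) and at the highest frequency L/2 (the
    period-2 alternation).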
    
  STEP 3 computes the sum S^n of the squares of the coefficients
    Y^n_j, for all j. I can't think of any useful alternative to this
    step. However, under the `signal analysis' interpretation, it is
    desirable to choose the kernel functions so that they are
    orthogonal and have unit norm (i.e. \sum_i h^n_j(i)**2 = 1). These
    conditions ensure that the power of the original signal y() is
    equal to the sum of all S^n.  That will make the value of S^n
    more meaningful, and easier to estimate for random texts.

Other comments

  In conclusion, my understanding of the Perakh-McKay papers is that
  computing the LSC is an indirect way of computing the power spectrum
  of the text. The reason why the LSC distinguishes meaningful texts
  from monkey gibberish is that the former have variations in letter
  frequencies at all scales, and hence a 1/f-like power spectrum;
  whereas the latter have uniform letter frequencies, at least over
  scales of a dozen letters, and therefore have a flat power spectrum.
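
  For readers who want to experiment, here is a minimal sketch of an
  LSC-like sum. It assumes that the statistic for chunk size n is the
  sum, over all letters and all pairs of adjacent chunks, of the squared
  difference in letter counts; that reading of the definition, and the
  function name, are assumptions rather than a transcription of the
  Perakh-McKay code.

```python
from collections import Counter

def lsc_sum(text, n):
    """Sum over letters and adjacent chunk pairs of squared count differences.

    The text is cut into consecutive chunks of n letters; an incomplete
    last chunk is dropped.
    """
    chunks = [text[i:i + n] for i in range(0, len(text) - n + 1, n)]
    counts = [Counter(chunk) for chunk in chunks]
    alphabet = set(text)
    total = 0
    for a, b in zip(counts, counts[1:]):
        total += sum((a[ch] - b[ch]) ** 2 for ch in alphabet)
    return total

# Two chunks with completely different letters: (2-0)**2 + (0-2)**2 = 8.
clustered = lsc_sum("aabb", 2)
# Two identical chunks: every difference is zero.
uniform = lsc_sum("abab", 2)
```

  Plotting lsc_sum(text, n) against n for a real text and for a
  shuffled copy of it is one way to see the contrast described above.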
  
  Looking at the LSC in the context of multiscale analysis suggests
  many possible improvements, such as using scales in geometric
  progression, and kernels which are smoother, orthogonal, and
  unitary.  Even if these changes do not make the LSC more sensitive,
  they should make the results easier to evaluate.
  
  In retrospect, it is not surprising that the LSC can distinguish the
  original Genesis from a line-permuted version: the spectra should be
  fairly similar at high frequencies (with periods shorter than one
  line), but at low frequencies the second text should have an
  essentially flat spectrum, like that of a random signal. The same
  can be said about monkey-generated texts.
  
  On the other hand, I don't expect the LSC to be more effective than
  simple letter/digraph frequency analysis when it comes to
  identifying the language of a text. The most significant influence
  in the LSC is the letter frequency histogram --- which is sensitive
  to topic (e.g. "-ed" is common when talking about the past) and to
  spelling rules (e.g. whether one writes "ue" or "ü"). The shape of
  the LSC (or Fourier) spectrum at high frequencies (small n) must be
  determined mainly by these factors. The shape of the spectrum at
  lower frequencies (higher n) should be determined chiefly by topic
  and style.

  I have only skimmed through the sections about estimating the LSC
  for random texts. I didn't see anything wrong, but my impression is
  that those sections are overdoing it. For instance, it should not
  matter whether the letters are drawn with or without replacement,
  or whether the last chunk is complete or not: surely the effect of
  these details on the LSC must be less than the sampling error.
  (Compare, for instance, the LSC of the first half of Genesis with
  that of the second half.) Thus, I believe that these sections could
  be much shorter if these details were dismissed right at the
  beginning.
  
All the best,

--stolfi

From jim@mail.rand.org  Sat Jan 22 11:01:35 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id LAA93905
	for <reeds@fry.research.att.com>; Sat, 22 Jan 2000 11:01:35 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id DE58D1E016; Sat, 22 Jan 2000 11:01:34 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail02-lax.pilot.net (mail-lax-2.pilot.net [205.139.40.16])
	by mail-green.research.att.com (Postfix) with ESMTP id 5EABB1E00B
	for <reeds@research.att.com>; Sat, 22 Jan 2000 11:01:34 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail02-lax.pilot.net with ESMTP id IAA15915; Sat, 22 Jan 2000 08:01:30 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id IAA09917; Sat, 22 Jan 2000 08:01:29 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id IAA22370 for <voynich@rand.org>; Sat, 22 Jan 2000 08:01:15 -0800 (PST)
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id IAA09873 for <voynich@rand.org>; Sat, 22 Jan 2000 08:01:14 -0800 (PST)
Received: from grande.dcc.unicamp.br (grande.dcc.unicamp.br [143.106.7.8]) by mail02-lax.pilot.net with ESMTP id HAA07894 for <voynich@rand.org>; Sat, 22 Jan 2000 07:46:47 -0800 (PST)
Received: from amazonas.dcc.unicamp.br (amazonas.dcc.unicamp.br [143.106.7.11])
	by grande.dcc.unicamp.br (8.9.3/8.9.3) with ESMTP id KAA08798
	for <voynich@rand.org>; Sat, 22 Jan 2000 10:29:01 -0200 (EDT)
Received: from coruja.dcc.unicamp.br (coruja.dcc.unicamp.br [143.106.24.80])
	by amazonas.dcc.unicamp.br (8.8.5/8.8.5) with ESMTP id KAA27780
	for <voynich@rand.org>; Sat, 22 Jan 2000 10:27:09 -0200 (EDT)
Received: (from stolfi@localhost)
	by coruja.dcc.unicamp.br (8.8.5/8.8.5) id KAA02811;
	Sat, 22 Jan 2000 10:27:09 -0200 (EDT)
Date: Sat, 22 Jan 2000 10:27:09 -0200 (EDT)
Message-Id: <200001221227.KAA02811@coruja.dcc.unicamp.br>
From: Jorge Stolfi <stolfi@dcc.unicamp.br>
To: voynich@rand.org
Subject: LSC and the VMS
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=iso-8859-1
Reply-To: stolfi@dcc.unicamp.br
Sender: jim@mail.rand.org
Status: O


Mark's LSC tests applied to the VMS give results typical of natural
languages, and quite different from those of monkey text.

This is very good news, at least for those of us who still believe
that there is a text in there to be read. As for myself, I have
remarked several times in the past that the distribution of words in
the VMS seemed to be far from uniform; it is nice to see that vague
feeling turned into a quantitative measurement.

Unfortunately, even this powerful test still leaves some room
for doubt.  

For one thing, while the LSC can unmask ordinary monkeys, it too can
be fooled with relative ease, once one realizes how it works. One
needs only to build a `multiscale monkey' that varies the frequencies
of the letters along the text, in a fractal-like manner.
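
A `multiscale monkey' of this kind could be sketched as below. This is
only one possible construction, assuming nested blocks of 2, 4, 8, ...
letters, each with its own random per-letter multiplier; the alphabet,
the number of levels, and the multiplier range are hypothetical
parameters, and nothing here has been checked against the actual LSC.

```python
import random

def multiscale_monkey(length, alphabet="abcdefgh", levels=6, seed=0):
    """Random text whose letter weights drift at several nested scales.

    At level k the text is cut into blocks of 2**k letters and every
    block gets its own random multiplier per letter. The weight of a
    letter at a given position is the product of the multipliers of all
    the blocks that cover that position.
    """
    rng = random.Random(seed)
    letters = list(alphabet)
    tables = []  # tables[k-1][block][letter index] -> multiplier
    for k in range(1, levels + 1):
        block = 2 ** k
        n_blocks = (length + block - 1) // block
        tables.append([[rng.uniform(0.5, 2.0) for _ in letters]
                       for _ in range(n_blocks)])
    out = []
    for pos in range(length):
        weights = [1.0] * len(letters)
        for k, table in enumerate(tables, start=1):
            row = table[pos // (2 ** k)]
            weights = [w * m for w, m in zip(weights, row)]
        out.append(rng.choices(letters, weights=weights)[0])
    return "".join(out)

sample = multiscale_monkey(200)
```

An ordinary monkey corresponds to levels=0, where every position uses
the same weights.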

Of course, it is hard to imagine a medieval forger being aware of
fractal processes. However, he could have used such a process without
knowing it. For instance, he may have copied an Arabic book, using
some fancy mapping of Arabic letters to the Voynichese alphabet. The
mapping would not have to be invertible, or consistently applied: as
long as the forger maintained some connection between the original text
and the transcript, the long-range frequency variations of the former
would show up in the latter as well.

Moreover, I suspect that any nonsense text that is generated `by hand'
(i.e. without the help of dice or other mechanical devices) will
show long-range variations in letter frequencies at least as 
strong as those seen in meaningful texts.

Thus Mark's results do not immediately rule out random but
non-mechanical babble or glossolalia. However, it is conceivable that
such texts will show *too much* long-range variation, instead of too
little. We really need some samples...

All the best,

--stolfi


From jim@mail.rand.org  Sat Jan 22 07:46:53 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id HAA94933
	for <reeds@fry.research.att.com>; Sat, 22 Jan 2000 07:46:53 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id 0A0274CE2A; Sat, 22 Jan 2000 07:46:53 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail02-lax.pilot.net (mail-lax-2.pilot.net [205.139.40.16])
	by mail-blue.research.att.com (Postfix) with ESMTP id 7456C4CE08
	for <reeds@research.att.com>; Sat, 22 Jan 2000 07:46:52 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail02-lax.pilot.net with ESMTP id EAA06148; Sat, 22 Jan 2000 04:46:48 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id EAA07210; Sat, 22 Jan 2000 04:46:47 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id EAA19448 for <voynich@rand.org>; Sat, 22 Jan 2000 04:46:27 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id EAA07194 for <voynich@rand.org>; Sat, 22 Jan 2000 04:46:27 -0800 (PST)
Received: from mailer3.bham.ac.uk (mailer3.bham.ac.uk [147.188.128.54]) by mail03-lax.pilot.net with ESMTP id EAA03960 for <voynich@rand.org>; Sat, 22 Jan 2000 04:46:26 -0800 (PST)
Received: from bham.ac.uk ([147.188.128.127])
	by mailer3.bham.ac.uk with esmtp (Exim 3.02 #16)
	id 12Bzvn-0004ly-00
	for voynich@rand.org; Sat, 22 Jan 2000 12:46:19 +0000
Received: from is-fs13.bham.ac.uk ([147.188.128.55])
	by bham.ac.uk with esmtp (Exim 3.10 #1)
	id 12Bzvn-00001s-00
	for voynich@rand.org; Sat, 22 Jan 2000 12:46:19 +0000
Received: from BHAM-IS-FS13/SpoolDir by is-fs13.bham.ac.uk (Mercury 1.44);
    22 Jan 00 12:46:19 +0000
Received: from SpoolDir by BHAM-IS-FS13 (Mercury 1.44); 22 Jan 00 12:46:07 +0000
Received: from oemcomputer (147.188.135.2) by is-fs13.bham.ac.uk (Mercury 1.44) with ESMTP;
    22 Jan 00 12:46:04 +0000
From: "Gabriel Landini" <G.Landini@bham.ac.uk>
Organization: The University of Birmingham, U.K.
To: voynich@rand.org
Date: Sat, 22 Jan 2000 12:46:26 -0000
MIME-Version: 1.0
Content-type: text/enriched; charset=US-ASCII
Content-transfer-encoding: 7BIT
Subject: Re: Reinterpreting the LSC (long)
Reply-To: G.Landini@bham.ac.uk
In-reply-to: <200001220152.XAA02392@coruja.dcc.unicamp.br>
References: <3886379D.4D1B8525@nctimes.net>
X-mailer: Pegasus Mail for Win32 (v3.12b)
Message-ID: <2D76BE1740@is-fs13.bham.ac.uk>
Sender: jim@mail.rand.org
Status: OR

On 21 Jan 00, at 23:52, Jorge Stolfi wrote:

>   The phenomenon has been observed in music, images, DNA
> sequences, etc. This knowledge has been useful for, among other
> things, designing good compression and approximation methods for
> such signals.


On the DNA arena there has been a never-ending debate between
2 groups. I thought some here might be interested in what this is
all about (after all, DNA is also a symbolic sequence that carries
a message), and in whether one can do a similar thing with the
VMS.


One group has done exactly what Jorge proposed: 


>   STEP 1 of the LSC computation consists in replacing each letter Y by
>     a vector of z zeros and ones, where the r-th component of the
>     vector is 1 if and only if Y is the r-th letter of the alphabet.


I think that this has been called a binary coding of DNA using a
Heaviside function.


for base A: 


ATCGAAGTACGC....
100011001000....


and so on for the other bases. These are then submitted to a Fast
Fourier Transform and the power spectrum is plotted as log(power)
vs. log(frequency). Slopes around -1 are characteristic of 1/f noise,
0 is white noise, -2 is Brown or brownian noise, -3 is black noise.

Using this method on the 4 bases separately, or on the average of
the 4 spectra, there are long-range correlations, but only at the
very long range. There have been claims that the slopes differ
between species and that this depends on evolution.
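
The first group's procedure (indicator coding, power spectrum, log-log
slope) can be sketched roughly as follows. A naive DFT stands in for
the FFT so that the example needs no external library; the function
names are hypothetical, and the sketch makes no claim about how those
groups actually estimated their slopes.

```python
import cmath
import math

def indicator(sequence, symbol):
    """1 where the sequence has `symbol`, 0 elsewhere."""
    return [1 if ch == symbol else 0 for ch in sequence]

def power_spectrum(signal):
    """Naive O(N^2) DFT; power at frequencies 1 .. N//2.

    Frequency 0 (the mean) is skipped because log(0) is undefined on
    the frequency axis.
    """
    n = len(signal)
    powers = []
    for f in range(1, n // 2 + 1):
        coeff = sum(signal[t] * cmath.exp(-2j * math.pi * f * t / n)
                    for t in range(n))
        powers.append(abs(coeff) ** 2 / n)
    return powers

def loglog_slope(powers):
    """Least-squares slope of log(power) against log(frequency)."""
    pts = [(math.log(f), math.log(p))
           for f, p in enumerate(powers, start=1) if p > 0]
    mx = sum(x for x, _ in pts) / len(pts)
    my = sum(y for _, y in pts) / len(pts)
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    return sxy / sxx

coded = indicator("ATCGAAGTACGC", "A")   # the example for base A above
spectrum = power_spectrum(coded)
```

The same functions apply to any symbol sequence, e.g. one indicator
signal per character of a VMS transcription.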


Another group used a more arbitrary method. As DNA's 4 bases
are of 1 of 2 different types (purine or pyrimidine), they construct a
1-dimensional random walk based on whether the next base is of
one type or the other (thus going up or down). This walk is then
submitted to something called R/S analysis, in which the sequence
is divided into chunks, the increments in the sequence are calculated,
and log(range / standard deviation of the increments) (hence R/S)
is plotted against log(segment size). Slopes (Hurst exponent)
of 0.5 are characteristic of brownian motion (which is the integral of
white noise); slopes larger than that make the sequence "persistent" and
slopes smaller than 0.5 make it "anti-persistent".
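
A minimal sketch of this second procedure is given below, assuming the
usual textbook form of R/S analysis (range of the mean-adjusted walk
divided by the standard deviation of the steps, averaged over chunks).
The function names and the choice of chunk sizes are hypothetical.

```python
import math

PURINES = set("AG")  # A and G are purines; C and T are pyrimidines

def increments(dna):
    """+1 for a purine, -1 for a pyrimidine: the steps of the random walk."""
    return [1 if base in PURINES else -1 for base in dna]

def rescaled_range(steps):
    """R/S of one chunk, or None if the steps have zero variance."""
    n = len(steps)
    mean = sum(steps) / n
    walk, position = [], 0.0
    for s in steps:
        position += s - mean
        walk.append(position)
    r = max(walk) - min(walk)
    sd = math.sqrt(sum((s - mean) ** 2 for s in steps) / n)
    return r / sd if sd > 0 else None

def hurst_exponent(steps, sizes):
    """Slope of log(mean R/S) against log(chunk size).

    Needs at least two chunk sizes that yield usable R/S values.
    """
    pts = []
    for n in sizes:
        values = [rescaled_range(steps[i:i + n])
                  for i in range(0, len(steps) - n + 1, n)]
        values = [v for v in values if v is not None and v > 0]
        if values:
            pts.append((math.log(n), math.log(sum(values) / len(values))))
    mx = sum(x for x, _ in pts) / len(pts)
    my = sum(y for _, y in pts) / len(pts)
    return (sum((x - mx) * (y - my) for x, y in pts)
            / sum((x - mx) ** 2 for x, _ in pts))

# A strictly alternating sequence never wanders: R/S stays at 1 for
# every even chunk size, so the slope is 0 (strongly anti-persistent).
h_alternating = hurst_exponent([1, -1] * 64, [4, 8, 16])
```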

The only sensible thing here would be to make the random walk
embedded in the 4-base space, but apparently if you do that, what
they try to show does not always show up (!). Note that this
"base-type" encoding is quite arbitrary, because reducing each base
to its type alone does not give the same sequence. This is like
saying that one can re-code a language based on whether the
letters have "roundy" bits (a,o,d,b,q,p) or not (i,t,y,x, etc.) (so
draw your own conclusions).


Anyway, that group claims that non-coding areas of DNA (the
so-called junk DNA) have long-range correlations, while the coding
areas (genes) do not. The finding is interesting, but to date I do not
think there are any clues about the meaning or relevance of this. Why?
Because that "random walk" does not fully carry the message of DNA.


Of course this can be applied to the VMS, but the only problem is  
that we know that there are a number of pages missing, so our  
sequence is not continuous. It may be interesting to try anyway. 


I've been about to do some of this since I joined the list. Perhaps it 
is time... 


cheers, 


Gabriel 



From jim@mail.rand.org  Sat Jan 22 11:39:50 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id LAA69224
	for <reeds@fry.research.att.com>; Sat, 22 Jan 2000 11:39:49 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id C07A91E016; Sat, 22 Jan 2000 11:39:49 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-green.research.att.com (Postfix) with ESMTP id 246E81E00B
	for <reeds@research.att.com>; Sat, 22 Jan 2000 11:39:49 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id IAA15487; Sat, 22 Jan 2000 08:39:45 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id IAA10660; Sat, 22 Jan 2000 08:39:45 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id IAA23347 for <voynich@rand.org>; Sat, 22 Jan 2000 08:39:39 -0800 (PST)
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id IAA10647 for <voynich@rand.org>; Sat, 22 Jan 2000 08:39:39 -0800 (PST)
Received: from mail.nctimes.net (mail.nctimes.net [208.239.16.66]) by mail02-lax.pilot.net with ESMTP id IAA06427 for <voynich@rand.org>; Sat, 22 Jan 2000 08:39:38 -0800 (PST)
Received: from nctimes.net ([208.239.20.108]) by mail.nctimes.net
          (Netscape Messaging Server 3.5)  with ESMTP id AAA35DB;
          Sat, 22 Jan 2000 08:36:00 -0800
Message-ID: <3889DD13.20FE24CC@nctimes.net>
Date: Sat, 22 Jan 2000 08:38:43 -0800
From: Mark Perakh <perakh@nctimes.net>
Reply-To: perakh@nctimes.net
Organization: home
X-Mailer: Mozilla 4.7 [en] (Win98; I)
X-Accept-Language: en,ru
MIME-Version: 1.0
To: G.Landini@bham.ac.uk
Cc: voynich@rand.org
Subject: Re: Reinterpreting the LSC (long)
References: <3886379D.4D1B8525@nctimes.net> <2D76BE1740@is-fs13.bham.ac.uk>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR

Gabriel and Jorge, many thanks for your interest in LSC. I forwarded Jorge's message to Brendan, who agreed with me that Jorge's treatment makes sense and sheds new light on LSC. Unfortunately, neither Brendan nor I understand some parts of Jorge's discussion.  Brendan is very busy with other things, and we both ceased to work on LSC some time ago, switching to different subjects.  We'll be happy if other people take over now and develop it in any direction they see fit. Best, Mark

Gabriel Landini wrote:

> On 21 Jan 00, at 23:52, Jorge Stolfi wrote:
> > The phenomenon has been observed in music, images, DNA
> > sequences, etc. This knowledge has been useful for, among other
> > things, designing good compression and approximation methods for
> > such signals.
>
> On the DNA arena there has been a never-ending debate between 2 groups. I thought some here might be interested in what this is all about (after all, DNA is also a symbolic sequence that carries a message), and in whether one can do a similar thing with the VMS.
>
> One group has done exactly what Jorge proposed:
>
> > STEP 1 of the LSC computation consists in replacing each letter Y by
> > a vector of z zeros and ones, where the r-th component of the
> > vector is 1 if and only if Y is the r-th letter of the alphabet.
>
> I think that this has been called a binary coding of DNA using a Heaviside function.
>
> for base A:
>
> ATCGAAGTACGC....
> 100011001000....
>
> and so on for the other bases. These are then submitted to a Fast Fourier Transform and the power spectrum is plotted as log(power) vs. log(frequency). Slopes around -1 are characteristic of 1/f noise, 0 is white noise, -2 is Brown or brownian noise, -3 is black noise.
> Using this method on the 4 bases separately, or on the average of the 4 spectra, there are long-range correlations, but only at the very long range. There have been claims that the slopes differ between species and that this depends on evolution.
>
> Another group used a more arbitrary method. As DNA's 4 bases are of 1 of 2 different types (purine or pyrimidine), they construct a 1-dimensional random walk based on whether the next base is of one type or the other (thus going up or down). This walk is then submitted to something called R/S analysis, in which the sequence is divided into chunks, the increments in the sequence are calculated, and log(range / standard deviation of the increments) (hence R/S) is plotted against log(segment size). Slopes (Hurst exponent) of 0.5 are characteristic of brownian motion (which is the integral of white noise); slopes larger than that make the sequence "persistent" and slopes smaller than 0.5 make it "anti-persistent".
> The only sensible thing here would be to make the random walk embedded in the 4-base space, but apparently if you do that, what they try to show does not always show up (!). Note that this "base-type" encoding is quite arbitrary, because reducing each base to its type alone does not give the same sequence. This is like saying that one can re-code a language based on whether the letters have "roundy" bits (a,o,d,b,q,p) or not (i,t,y,x, etc.) (so draw your own conclusions).
>
> Anyway, that group claims that non-coding areas of DNA (the so-called junk DNA) have long-range correlations, while the coding areas (genes) do not. The finding is interesting, but to date I do not think there are any clues about the meaning or relevance of this. Why? Because that "random walk" does not fully carry the message of DNA.
>
> Of course this can be applied to the VMS, but the only problem is that we know that there are a number of pages missing, so our sequence is not continuous. It may be interesting to try anyway.
>
> I've been about to do some of this since I joined the list. Perhaps it is time...
>
> cheers,
>
> Gabriel

From jim@mail.rand.org  Sat Jan 22 12:32:23 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id MAA86723
	for <reeds@fry.research.att.com>; Sat, 22 Jan 2000 12:32:23 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id EC6F91E018; Sat, 22 Jan 2000 12:32:23 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-green.research.att.com (Postfix) with ESMTP id 443C51E016
	for <reeds@research.att.com>; Sat, 22 Jan 2000 12:32:22 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id JAA18988; Sat, 22 Jan 2000 09:32:18 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id JAA11605; Sat, 22 Jan 2000 09:32:17 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id JAA24554 for <voynich@rand.org>; Sat, 22 Jan 2000 09:32:06 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id JAA11551 for <voynich@rand.org>; Sat, 22 Jan 2000 09:32:05 -0800 (PST)
Received: from mail.nctimes.net (mail.nctimes.net [208.239.16.66]) by mail01-lax.pilot.net with ESMTP id IAA16757 for <voynich@rand.org>; Sat, 22 Jan 2000 08:53:56 -0800 (PST)
Received: from nctimes.net ([208.239.20.108]) by mail.nctimes.net
          (Netscape Messaging Server 3.5)  with ESMTP id AAA3B3D;
          Sat, 22 Jan 2000 08:50:18 -0800
Message-ID: <3889E06D.BEAD3B4B@nctimes.net>
Date: Sat, 22 Jan 2000 08:53:01 -0800
From: Mark Perakh <perakh@nctimes.net>
Reply-To: perakh@nctimes.net
Organization: home
X-Mailer: Mozilla 4.7 [en] (Win98; I)
X-Accept-Language: en,ru
MIME-Version: 1.0
To: stolfi@dcc.unicamp.br
Cc: voynich@rand.org
Subject: Re: LSC and the VMS
References: <200001221227.KAA02811@coruja.dcc.unicamp.br>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: O

Jorge, I fully agree with your conclusion.  Indeed, the LSC test revealed
in the VMS features identical to those of the meaningful texts we explored.
On the other hand, if we assume that each Voynichese symbol is a letter,
then the letter frequency distribution in the VMS is much more non-uniform
than in any of the 12 languages we tested. Furthermore, in one of my papers
you can see the LSC results obtained for a gibberish which I created by
hitting (supposedly randomly) the keys on a keyboard.  It has some features
of a meaningful text, but also has some subtle differences from meaningful
texts.  You probably noticed that my conclusion was that, if we rely on
LSC data, VMS can be either meaningful or a result of a very sophisticated
effort to imitate a meaningful text, in which even the relative
frequencies of vowels and consonants have been skilfully faked.  I can
hardly imagine such an extraordinarily talented and diligent forger, so I
am inclined to guess VMS is a meaningful text, but some doubts remain.
Moreover, if VMS symbols are not individual letters, all LSC results hang
in the air. Best, Mark

Jorge Stolfi wrote:

> Mark's LSC tests applied to the VMS give results typical of natural
> languages, and quite different from those of monkey text.
>
> This is very good news, at least for those of us who still believe
> that there is a text in there to be read. As for myself, I have
> remarked several times in the past that the distribution of words in
> the VMS seemed to be far from uniform; it is nice to see that vague
> feeling turned into a quantitative measurement.
>
> Unfortunately, even this powerful test still leaves some room
> for doubt.
>
> For one thing, while the LSC can unmask ordinary monkeys, it too can
> be fooled with relative ease, once one realizes how it works. One
> needs only to build a `multiscale monkey' that varies the frequencies
> of the letters along the text, in a fractal-like manner.
>
> Of course, it is hard to imagine a medieval forger being aware of
> fractal processes. However, he could have used such a process without
> knowing it. For instance, he may have copied an Arabic book, using
> some fancy mapping of Arabic letters to the Voynichese alphabet. The
> mapping would not have to be invertible, or consistently applied: as
> long as the forger maintained some connection between the original text
> and the transcript, the long-range frequency variations of the former
> would show up in the latter as well.
>
> Moreover, I suspect that any nonsense text that is generated `by hand'
> (i.e. without the help of dice or other mechanical devices) will
> show long-range variations in letter frequencies at least as
> strong as those seen in meaningful texts.
>
> Thus Mark's results do not immediately rule out random but
> non-mechanical babble or glossolalia. However, it is conceivable that
> such texts will show *too much* long-range variation, instead of too
> little. We really need some samples...
>
> All the best,
>
> --stolfi

From jim@mail.rand.org  Sat Jan 22 17:18:43 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id RAA76361
	for <reeds@fry.research.att.com>; Sat, 22 Jan 2000 17:18:43 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id 93AFE4CE11; Sat, 22 Jan 2000 17:18:43 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-blue.research.att.com (Postfix) with ESMTP id E7B994CE0C
	for <reeds@research.att.com>; Sat, 22 Jan 2000 17:18:42 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id OAA05674; Sat, 22 Jan 2000 14:18:34 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id OAA16757; Sat, 22 Jan 2000 14:18:33 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id OAA00355 for <voynich@rand.org>; Sat, 22 Jan 2000 14:18:16 -0800 (PST)
From: mskala@ansuz.sooke.bc.ca
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id OAA16732 for <voynich@rand.org>; Sat, 22 Jan 2000 14:18:15 -0800 (PST)
Received: from ansuz.sooke.bc.ca (bbs.bbc.org [139.142.115.249]) by mail02-lax.pilot.net with ESMTP id OAA25270 for <voynich@rand.org>; Sat, 22 Jan 2000 14:18:13 -0800 (PST)
Received: from localhost (mskala@localhost)
	by ansuz.sooke.bc.ca (8.9.3/8.8.7) with ESMTP id KAA20911;
	Sat, 22 Jan 2000 10:43:52 -0800
Date: Sat, 22 Jan 2000 10:43:52 -0800 (PST)
To: Mark Perakh <perakh@nctimes.net>
Cc: stolfi@dcc.unicamp.br, voynich@rand.org
Subject: Re: LSC and the VMS
In-Reply-To: <3889E06D.BEAD3B4B@nctimes.net>
Message-ID: <Pine.LNX.4.10.10001221035350.20874-100000@ansuz.sooke.bc.ca>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: jim@mail.rand.org
Status: O

On Sat, 22 Jan 2000, Mark Perakh wrote:
> Moreover, if VMS symbols are not individual letters, all LSC results hang
> in the air. Best, Mark

In order to draw meaningful conclusions from statistics, we really need
statistics that are stable when we change the arbitrary decisions (such as
where letters begin and end) - as I pointed out on this list in Nov 1998.
Unfortunately, I haven't been able to come up with any such statistics,
but I do have some hopes for LSC.

The fundamental problem is that we need to compare large-scale
correlations with small-scale correlations, *without* knowing exactly how
large or small the scales are except in relation to each other.  I mean,
if I take one "word" of the VMS, I don't know if that's really the size of
linguistic unit we normally call a word, if it's more like a "letter", or
if (like the Masonic reminder books someone mentioned, which have a
proper name that I forget) it's an acronym that could abbreviate a whole
sentence.  All I know is that there are one hundred VMS "words" in a
passage one hundred VMS "words" long.

To be useful, statistics have to look at relative scale instead of
absolute scale, and the LSC looks good to me because it seems to do that.

Matthew Skala                       "Ha!" said God, "I've got Jon Postel!"
mskala@ansuz.sooke.bc.ca            "Yes," said the Devil, "but *I've* got
http://www.islandnet.com/~mskala/    all the sysadmins!"

From jim@mail.rand.org  Sun Jan 23 11:03:05 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id LAA31396
	for <reeds@fry.research.att.com>; Sun, 23 Jan 2000 11:03:05 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id 225E64CF21; Sun, 23 Jan 2000 11:03:05 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-blue.research.att.com (Postfix) with ESMTP id 7C7964CF1B
	for <reeds@research.att.com>; Sun, 23 Jan 2000 11:03:04 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id IAA16476; Sun, 23 Jan 2000 08:02:01 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id IAA00447; Sun, 23 Jan 2000 08:02:00 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id IAA14927 for <voynich@rand.org>; Sun, 23 Jan 2000 08:01:28 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id IAA00397 for <voynich@rand.org>; Sun, 23 Jan 2000 08:01:26 -0800 (PST)
Received: from mailout02.sul.t-online.de (mailout02.sul.t-online.de [194.25.134.17]) by mail01-lax.pilot.net with ESMTP id IAA16448 for <voynich@rand.org>; Sun, 23 Jan 2000 08:01:25 -0800 (PST)
Received: from fwd02.sul.t-online.de 
	by mailout02.sul.t-online.de with smtp 
	id 12CPS8-0006wh-0A; Sun, 23 Jan 2000 17:01:24 +0100
Received: from  (0625764225-0001@[193.159.5.27]) by fwd02.sul.t-online.de
	with smtp id 12CPS7-0TZJ44C; Sun, 23 Jan 2000 17:01:23 +0100
From: Zandbergen@t-online.de (Rene)
To: voynich@rand.org
References: <200001221227.KAA02811@coruja.dcc.unicamp.br>
Subject: Re: LSC and the VMS
X-Mailer: T-Online eMail 2.3
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT
Date: Sun, 23 Jan 2000 17:01:23 +0100
Message-ID: <12CPS7-0TZJ44C@fwd02.sul.t-online.de>
X-Sender: 0625764225-0001@t-dialin.net
Sender: jim@mail.rand.org
Status: O

Stolfi wrote:

> Mark's LSC tests applied to the VMS give results typical of natural
> languages, and quite different from those of monkey text.
>
> This is very good news, at least for those of us who still believe
> that there is a text in there to be read. As for myself, I have
> remarked several times in the past that the distribution of words in
> the VMS seemed to be far from uniform; it is nice to see that vague
> feeling turned into a quantitative measurement.
>
> Unfortunately, even this powerful test still leaves some room
> for doubt.  
>
> For one thing, while the LSC can unmask ordinary monkeys, it too can
> be fooled with relative ease, once one realizes how it works....

This is my feeling exactly (although I am not yet sure about _how_
easy it would be). In order to have real, strong evidence that the
VMs contains meaningful text, we need to know how one can create a
'meaningless' text that still exhibits the same properties as meaningful
text. More to the point: we need to find a mechanism that could have been
applied 400-500 years ago.

Jacques already pointed out that we don't actually know how to define
meaningful and meaningless. This may well prove to be a serious problem.
When trying to generate meaningless texts which the LSC would classify
as meaningful, or vice versa, we're likely to end up in the no-man's
land bordering on the two....
Take a meaningful text and start removing words (every 10th, every 2nd,
at random...). When does the text stop being meaningful?
How does the LSC curve behave?
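
The degraded texts for such an experiment are easy to produce. The
sketch below is only a starting point, assuming that words are
separated by whitespace; the function names are hypothetical.

```python
import random

def drop_every_kth_word(text, k):
    """Remove the k-th, 2k-th, 3k-th, ... word of the text."""
    words = text.split()
    return " ".join(w for i, w in enumerate(words, start=1) if i % k != 0)

def drop_random_words(text, fraction, seed=0):
    """Remove each word independently with probability `fraction`."""
    rng = random.Random(seed)
    return " ".join(w for w in text.split() if rng.random() >= fraction)

thinned = drop_every_kth_word("a b c d e f", 2)   # keeps "a c e"
```

Running the LSC on the output for k = 10, 5, 2 and for increasing
fractions would show how the curve changes as words are removed.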

I'll take a bit more time replying to Stolfi's earlier post.
The important point is that the LSC test identifies texts as meaningful if for
medium-size chunks the correlation between letter frequencies is *higher*
than for random texts, while for longer chunks this correlation is actually
*lower*.

Cheers, Rene

From jim@mail.rand.org  Mon Jan 24 00:18:29 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id AAA12910
	for <reeds@fry.research.att.com>; Mon, 24 Jan 2000 00:18:19 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id 6A90A1E136; Sun, 23 Jan 2000 11:31:39 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail02-lax.pilot.net (mail-lax-2.pilot.net [205.139.40.16])
	by mail-green.research.att.com (Postfix) with ESMTP id 144A41E029
	for <reeds@research.att.com>; Sun, 23 Jan 2000 11:31:27 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail02-lax.pilot.net with ESMTP id IAA21469; Sun, 23 Jan 2000 08:30:15 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id IAA00858; Sun, 23 Jan 2000 08:30:12 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id IAA15456 for <voynich@rand.org>; Sun, 23 Jan 2000 08:30:04 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id IAA00824 for <voynich@rand.org>; Sun, 23 Jan 2000 08:30:03 -0800 (PST)
Received: from mailer3.bham.ac.uk (mailer3.bham.ac.uk [147.188.128.54]) by mail01-lax.pilot.net with ESMTP id IAA17552 for <voynich@rand.org>; Sun, 23 Jan 2000 08:28:40 -0800 (PST)
Received: from bham.ac.uk ([147.188.128.127])
	by mailer3.bham.ac.uk with esmtp (Exim 3.02 #16)
	id 12CPsV-0005Fu-00
	for voynich@rand.org; Sun, 23 Jan 2000 16:28:39 +0000
Received: from is-fs13.bham.ac.uk ([147.188.128.55])
	by bham.ac.uk with esmtp (Exim 3.10 #1)
	id 12CPsV-00008Q-00
	for voynich@rand.org; Sun, 23 Jan 2000 16:28:39 +0000
Received: from BHAM-IS-FS13/SpoolDir by is-fs13.bham.ac.uk (Mercury 1.44);
    23 Jan 00 16:28:39 +0000
Received: from SpoolDir by BHAM-IS-FS13 (Mercury 1.44); 23 Jan 00 16:28:32 +0000
Received: from oemcomputer (147.188.135.3) by is-fs13.bham.ac.uk (Mercury 1.44) with ESMTP;
    23 Jan 00 16:28:23 +0000
From: "Gabriel Landini" <G.Landini@bham.ac.uk>
Organization: The University of Birmingham, U.K.
To: voynich@rand.org
Date: Sun, 23 Jan 2000 16:28:44 -0000
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Subject: Re: LSC and the VMS
Reply-To: G.Landini@bham.ac.uk
In-reply-to: <200001221227.KAA02811@coruja.dcc.unicamp.br>
X-mailer: Pegasus Mail for Win32 (v3.12b)
Message-ID: <492C63594A@is-fs13.bham.ac.uk>
Sender: jim@mail.rand.org
Status: OR

On 22 Jan 00, at 10:27, Jorge Stolfi wrote:
> Moreover, I suspect that any nonsense text that is generated `by hand'
> (i.e. without the help of dice or other mechanical devices) will
> show long-range variations in letter frequencies at least as 
> strong as those seen in meaningful texts.

Your suspicion is correct. I remember reading a paper where they 
showed a human-written sequence of random numbers which had 
long range correlations.

I can post the reference on Monday if anybody is interested. 
I also remember that the power spectrum of the equal-symbol 
sequence of the number Pi in the decimal expansion (coded as in 
the 1st example I gave on DNA) is flat for a very large number of 
digits (I will post this too, but I got the reference at work).

Cheers,
Gabriel

From jim@mail.rand.org  Sun Jan 23 17:21:15 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id RAA08393
	for <reeds@fry.research.att.com>; Sun, 23 Jan 2000 17:21:15 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id 0C34F4CFD4; Sun, 23 Jan 2000 12:57:42 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-blue.research.att.com (Postfix) with ESMTP id 642534CFB0
	for <reeds@research.att.com>; Sun, 23 Jan 2000 12:57:41 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id JAA21748; Sun, 23 Jan 2000 09:56:39 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id JAA02486; Sun, 23 Jan 2000 09:56:38 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id JAA16948 for <voynich@rand.org>; Sun, 23 Jan 2000 09:56:21 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id JAA02465 for <voynich@rand.org>; Sun, 23 Jan 2000 09:56:20 -0800 (PST)
Received: from mail.nctimes.net (mail.nctimes.net [208.239.16.66]) by mail03-lax.pilot.net with ESMTP id JAA22067 for <voynich@rand.org>; Sun, 23 Jan 2000 09:56:20 -0800 (PST)
Received: from nctimes.net ([208.239.20.159]) by mail.nctimes.net
          (Netscape Messaging Server 3.5)  with ESMTP id AAA1B5;
          Sun, 23 Jan 2000 09:52:41 -0800
Message-ID: <388B4068.C18A1B8E@nctimes.net>
Date: Sun, 23 Jan 2000 09:54:48 -0800
From: Mark Perakh <perakh@nctimes.net>
Reply-To: perakh@nctimes.net
Organization: home
X-Mailer: Mozilla 4.7 [en] (Win98; I)
X-Accept-Language: en,ru
MIME-Version: 1.0
To: Rene <Zandbergen@t-online.de>
Cc: voynich@rand.org
Subject: Re: LSC and the VMS
References: <200001221227.KAA02811@coruja.dcc.unicamp.br> <12CPS7-0TZJ44C@fwd02.sul.t-online.de>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR



Rene wrote:

>
>
>
> Jacques already pointed out that we don't actually know how to define
> meaningful and meaningless.

 Rene, if you have in mind a rigorous mathematical definition of meaningful vs
meaningless, that may indeed be a hard task (but certainly a possible one).
However, to begin with we may just be satisfied with the fact that when we see
a meaningful text we recognize it as such (if we know the language). Of course,
when you deal with an unknown language the situation is different, but what we
can do is test various forms of meaningless texts, compare them to a variety
of meaningful texts in various languages, and try to determine the common
features of each of those two types of text. LSC can probably be useful as one
of the tools, though it had better be complemented by other tools.  I tried
something in that vein using LSC plus letter frequency distributions.  The
results, I agree, have not been really conclusive, so some additional tools
have to be invented.  Even if none of these tools is by itself sufficient to
exclude doubt, in their totality they should at some point provide
overwhelming evidence in favor of the text being either meaningful or
gibberish.

What tools?  I can think of several. One example: some time ago, the Biblical
texts were tested by comparing the frequencies of words in the left and right
halves of the verses.  There was an obvious correlation, which disappeared when
the text was letter-permuted. No precise mathematical measure of that
correlation was derived, but there are at least a few people on the list who
are capable of taking up that idea and developing it. Another obvious
correlation that disappeared when a text was permuted was between the first
one, two, or three letters of consecutive words, observed along the entire
text. Some other similar correlations were also noticed. All these things were
tried before LSC became the first choice. These observations were never
developed beyond a few preliminary trials, never properly quantified, and
never well recorded.  But they are real, and could be developed to the same
extent as LSC was and provide additional information. Of course, many other
(fresh) ideas for analyzing texts can be suggested.  Somebody needs to spend
time on that. Cheers, Mark



From jim@mail.rand.org  Mon Jan 24 05:37:18 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id FAA59042
	for <reeds@fry.research.att.com>; Mon, 24 Jan 2000 05:37:17 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id 1489E4CE09; Mon, 24 Jan 2000 05:37:17 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail03-lax.pilot.net (mail-lax-3.pilot.net [205.139.40.17])
	by mail-blue.research.att.com (Postfix) with ESMTP id 0B5C74CE08
	for <reeds@research.att.com>; Mon, 24 Jan 2000 05:37:16 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail03-lax.pilot.net with ESMTP id CAA04848; Mon, 24 Jan 2000 02:37:10 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id CAA21347; Mon, 24 Jan 2000 02:37:09 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id CAA06627 for <voynich@rand.org>; Mon, 24 Jan 2000 02:36:41 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id CAA21326 for <voynich@rand.org>; Mon, 24 Jan 2000 02:36:40 -0800 (PST)
Received: from grande.dcc.unicamp.br (grande.dcc.unicamp.br [143.106.7.8]) by mail03-lax.pilot.net with ESMTP id CAA04715 for <voynich@rand.org>; Mon, 24 Jan 2000 02:36:33 -0800 (PST)
Received: from amazonas.dcc.unicamp.br (amazonas.dcc.unicamp.br [143.106.7.11])
	by grande.dcc.unicamp.br (8.9.3/8.9.3) with ESMTP id IAA27488
	for <voynich@rand.org>; Mon, 24 Jan 2000 08:35:59 -0200 (EDT)
Received: from coruja.dcc.unicamp.br (coruja.dcc.unicamp.br [143.106.24.80])
	by amazonas.dcc.unicamp.br (8.8.5/8.8.5) with ESMTP id IAA02350
	for <voynich@rand.org>; Mon, 24 Jan 2000 08:35:55 -0200 (EDT)
Received: (from stolfi@localhost)
	by coruja.dcc.unicamp.br (8.8.5/8.8.5) id IAA04171;
	Mon, 24 Jan 2000 08:35:55 -0200 (EDT)
Date: Mon, 24 Jan 2000 08:35:55 -0200 (EDT)
Message-Id: <200001241035.IAA04171@coruja.dcc.unicamp.br>
From: Jorge Stolfi <stolfi@dcc.unicamp.br>
To: voynich@rand.org
Subject: Yet more bean-counting: [aoy]
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=iso-8859-1
Reply-To: stolfi@dcc.unicamp.br
Sender: jim@mail.rand.org
Status: OR


Introduction
------------

  In previous notes about the structure of Voynichese words, I have
  been ignoring the `circle' letters O = { a o y }. This note looks at the
  distribution of the O-letters within the words.

The word paradigm
-----------------

  As you may recall, my Voynichese word paradigm (ignoring circle
  letters) has the form

    Q?1 D?1 X?2 M?1 X?2 R?2 

  where the notation A?n means from zero to n instances of A,
  and

    Q = { q }

    M = { k t p f } (the `gallows'), possibly preceded by "I" or "c" and/or
        followed by "h" and/or "e".

    X = { ch sh ee } (the `benches'), possibly followed by one "e" 

    R  = D + F

    D = { d l r s x v } (the `dealers'), possibly preceded by "i"s and/or
        followed by "e"

    F = { n m g j } (the `finals'), also possibly preceded by "i"s and/or
        followed by "e"

  I will use the term `element' to mean any of these letters with
  the attached [iceh] modifiers. 

  The paradigm implies that a word has a three-layer structure, with a
  `core' of gallows elements, a `mantle' of benches, and a `crust' of
  dealers and finals. Any layer may be empty, but if present it must be
  a contiguous substring of elements, adjacent to or surrounding the
  deeper layers. In particular the paradigm forbids words with more than
  one M-letter, or two X- or M-letters separated by an R letter.
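
  The paradigm can be read as a regular expression over element
  classes. The sketch below assumes each element has already been
  reduced to one class letter (Q, M, X, D or F); that reduction and the
  function name are illustrative assumptions, not part of the notes.

```python
import re

# One class letter per element: Q, M (gallows), X (bench), D (dealer), F (final).
# R = D + F, so the trailing "R?2" becomes [DF]{0,2}.
PARADIGM = re.compile(r"Q?D?X{0,2}M?X{0,2}[DF]{0,2}")

def fits_paradigm(classes):
    """True if the element-class string matches Q?1 D?1 X?2 M?1 X?2 R?2."""
    return PARADIGM.fullmatch(classes) is not None

# Two sequences that fit, one with two M's, one with X's split by a dealer.
for seq in ["QXMXD", "DXF", "MDM", "XDX"]:
    print(seq, fits_paradigm(seq))
```

  The last two sequences are the forbidden cases named above: more than
  one M-letter, and two X-letters separated by an R-letter.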

  (Beware that my notation and nomenclature has been changing through
  these notes. Sorry for the confusion, but these are *working*
  notes...)

Circles are not doubled
-----------------------

  Implicit in the paradigm is the rule that circle letters can only be
  inserted before or after an element, not within it (e.g. not between
  a "k" and its modifying "e"). Thus a word with N elements has N+1
  `slots' where the circle letters could be inserted.

  There are about 51972 O-letters in the sample text, and about 109672
  possible O-slots between elements. These slots are occupied as follows
  
    0 circle letters:  58207 (53.1%)
    1 circle letter:   50819 (46.3%)
    2 circle letters:    501  (0.4%)
    3 circle letters:      3  (0.0%)
    
  There are no instances of 4 or more circles in a row, except for 
  the `primeval scream' atop one of the cosmo diagrams. (There are
  also 142 anomalous inter-element strings, such as "oe". We will
  ignore them for now.)
  
  Note that there is a definite dislike for two or more O-letters in a
  row. If there was no restriction about the interleaving of O's and
  other elements, then 62% of the slots would be empty, 29% would
  contain one circle, 7% (i.e. over 7000) would contain two circles,
  1% (over 1000) would have three, 0.1% would have 4, and so on.
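
  The text does not say which model lies behind these expected figures.
  As an assumption, a Poisson distribution with the same mean number of
  circles per slot reproduces them to within about one percentage
  point; the sketch below uses only the two totals quoted above.

```python
import math

o_letters = 51972   # circle letters in the sample (quoted above)
slots = 109672      # O-slots between elements (quoted above)
lam = o_letters / slots  # mean number of circles per slot

def poisson(k, lam):
    """Probability of exactly k events under a Poisson law with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

for k in range(5):
    p = poisson(k, lam)
    print(f"{k} circles: {100 * p:5.1f}%  (about {p * slots:.0f} slots)")
```

  Comparing the printed slot counts with the observed 501 doubles and
  3 triples shows how strongly the circles avoid each other.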
  
Distribution of circles in local context
----------------------------------------

  The following table shows the occurrences of O-strings according to
  the two adjacent elements. The letters { a y } have been mapped to
  "o" to make the table shorter. Word boundaries are denoted "#" and
  empty O-strings by "_".
  
                Inter-element string
                -----------------------------
      Context       _     o    oo   ooo other
      -------   ----- ----- ----- ----- -----
      #*#       -N/A-   240    18     .     3
      M*#         189  2896    32     1     3
      X*#         110  4942    51     .     3
      R*#       19274  7311    13     .     2

      #*R        6402  4748   174     1    33
      R*R         838  6592    34     .     6
      X*R        5086  4749    69     .     2
      M*R        1402  7058    64     1     6

      #*X        8899   577     9     .     1
      R*X        1755    53     .     .     .
      X*X        1294    37     .     .     .
      M*X        6186    95     .     .     1

      #*M        3635 10212    25     .    40
      R*M        1237   159     5     .     7
      X*M        1633   894     5     .    29
      M*M          11   114     .     .     6

      other       166   142     2     .     .

      TOTAL     58207 50819   501     3   142 

  (The "other" counts are letter groups such as "oe", "shh", "ich",
  detached [ice], etc. which cannot be parsed into the standard set of
  elements.)
  
  Note again that, overall, half of the O-slots are empty, and half
  are occupied by "o". If the placement of the "o"s were independent
  of the context, we should expect to see the same 1:1 ratio between
  the first two numbers in each row. We see instead that the contexts
  M*X, X*X, R*X, #*X strongly repel O-strings (ratios 65:1, 35:1,
  33:1, 15:1, respectively), while X*#, M*#, and R*R strongly attract
  them (ratios 1:45, 1:15, and 1:8, respectively).
  
  These numbers suggest that an O letter is either word-final, or a
  modifier for the following R or M letter (but not X letter). Indeed,
  of the 50819 instances of isolated "o", 49675 instances (97.7%) are
  in one of these contexts. However, this cannot be taken as an axiom,
  because, of the 18905 O-slots that are followed by an X element,
  771 (4%) are filled --- a percentage which is too high to ignore. So
  the truth must be more complicated than that.
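
  The ratios and the 49675 figure can be rechecked from the "_" and "o"
  columns of the table. In this sketch the dictionary layout and the
  names are illustrative; leaving out the "#*#" row (words consisting of
  a lone circle) is an inference that happens to reproduce the quoted
  count, not something stated in the text.

```python
# (empty, single "o") counts per context, copied from the table above.
table = {
    "#*#": (None, 240), "M*#": (189, 2896), "X*#": (110, 4942), "R*#": (19274, 7311),
    "#*R": (6402, 4748), "R*R": (838, 6592), "X*R": (5086, 4749), "M*R": (1402, 7058),
    "#*X": (8899, 577), "R*X": (1755, 53), "X*X": (1294, 37), "M*X": (6186, 95),
    "#*M": (3635, 10212), "R*M": (1237, 159), "X*M": (1633, 894), "M*M": (11, 114),
}

def empty_to_filled(context):
    """Ratio of empty slots to single-circle slots in one context."""
    empty, single = table[context]
    return empty / single

# Single "o" that is word-final after an element, or precedes an R or M element.
explained = sum(single for ctx, (_, single) in table.items()
                if ctx != "#*#" and ctx[-1] in "#RM")
total_single = 50819  # "o" column of the TOTAL row
print(f"{explained} of {total_single} = {100 * explained / total_single:.1f}%")

for ctx in ["M*X", "X*X", "R*X", "#*X", "X*#", "M*#", "R*R"]:
    print(ctx, round(empty_to_filled(ctx), 2))
```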
  
Location of circles in the word paradigm 
----------------------------------------

  Let's say that a word is `hard' if it has a non-empty core and/or
  mantle, and `soft' otherwise.
  
  In a hard word we can isolate a maximal `prefix' and a maximal
  `suffix' consisting of non-core, non-mantle letters --- namely,
  dealers, finals, circles, and any [ie] modifiers. Thus, for example,
  the hard word "orckhocheody" can be split into prefix "or", suffix
  "ody", and core-mantle "ckhoche".

  Note that a prefix, suffix, or soft word with N non-circle elements
  has N+1 slots where circles could be inserted, while a core-mantle
  with N non-circle elements has N-1 such slots. The following table
  shows the counts of empty and occupied circle slots in the three
  parts of hard words.

    soft words:    22435 O-slots,  9952 occupied (44%)
    prefixes:      29078 O-slots, 12082 occupied (42%)
    suffixes:      46322 O-slots, 27572 occupied (60%)
    core-mantles:  11133 O-slots,  1534 occupied (14%)
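
  The prefix / core-mantle / suffix split described above can be
  sketched as follows. The regular expressions are one reading of the
  element definitions in "The word paradigm" and are assumptions; they
  are not the parser used to produce these counts, and ambiguous
  strings (for example a dealer's "e" next to an "ee" bench) may parse
  differently.

```python
import re

# Assumed element patterns; alternatives are tried from left to right.
TOKEN = re.compile(
    r"(?P<M>[cI]?[ktpf]h?e?)"      # gallows with optional modifiers
    r"|(?P<X>(?:ch|sh|ee)e?)"      # benches, optionally followed by one "e"
    r"|(?P<R>i*[dlrsxvnmgj]e?)"    # dealers and finals
    r"|(?P<O>[aoy])"               # circles
    r"|(?P<other>.)"               # anything that does not parse
)

def _join(tokens):
    return "".join(text for _, text in tokens)

def split_hard_word(word):
    """Return (prefix, core_mantle, suffix), or None for a soft word."""
    tokens = [(m.lastgroup, m.group()) for m in TOKEN.finditer(word)]
    hard = [i for i, (kind, _) in enumerate(tokens) if kind in ("M", "X")]
    if not hard:
        return None
    first, last = hard[0], hard[-1]
    return (_join(tokens[:first]),
            _join(tokens[first:last + 1]),
            _join(tokens[last + 1:]))

print(split_hard_word("orckhocheody"))
print(split_hard_word("chokedy"))
print(split_hard_word("daiin"))
```

  The first call exercises the example word given in the text; a soft
  word such as "daiin" has no gallows or bench element, so the function
  returns None for it.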

  Thus we see that the O-letters strongly avoid the interior of
  core-mantles. In fact, if we look closely, we find that most of the
  filled O-slots in core-mantles are combinations "Xo" that precede
  the core, as in "chokedy" or "shchotchy"; or occur in `invalid'
  core-mantles (with more than one M, and/or with R intrusions).
  Here are the numbers:
  
    valid core-mantles with O-slots: 9023
      without O-insertions:    8076 (89.5%)
      with "Xo" before core:    778  (8.6%)
      with "y" insertions:       80  (0.8%)
      with other O-insertions:   89  (0.9%)

    invalid core-mantles with O-slots: 655
      without O-insertions:     109 (16.6%)
      with "Xo" before core:     94 (14.3%) 
      with "y" intrusions:       52  (7.9%)
      with other O-insertions:  400 (61.1%)
      
  Note that "y" is almost always word-initial or word-final, so an
  intra-word "y" is probably the result of an omitted word space.
  The 89 valid core-mantles with other O-insertions may well be
  due to the same cause.
  
  Moreover, the enhanced frequency of "y" inside invalid core-mantles
  suggests that these too are the result of joined words. So the
  400 invalid core-mantles with other O-insertions are not significant. 
    
  In short, the circles are found mostly in the `crust' of words,
  except for some 800 instances of "cho" and "sho" sequences in the
  first half of the mantle. 

Relationship between O- and R-letters
-------------------------------------

  Let's look more closely at the interleaving of O and R letters in
  the crust of words. That means about 8800 soft (crust-only) words,
  as well as the prefixes and suffixes of about 26,000 hard words.
  
  First, let's classify those strings according to the number of
  R's and the number of O's:
  
    
    SOFT WORDS

                                                    O-letters in word
                            -----------------------------------------
      R-letters in word         0      1      2      3      4   Total
      --------------------  -----  -----  -----  -----  -----   -----
      0 R-letters               -    240     18      .      .     258
      1 R-letter              475   3000    387      5      .    3867
      2 R-letters              62   3113    936     63      3    4177
      3 R-letters               7     63    283     55      7     415
      4 R-letters               1      4     24      6      1      36
      5 R-letters               .      .      .      1      .       1

      Total                   545   6420   1648    130     11    8754
      Rel. percent            6.2%  73.3%  18.8%   1.5%   0.1%  100.0%
      Abs. percent            1.6%  18.3%   4.7%   0.4%   0.0%   24.9%
      
      Average number of R-letters: 1.56
      Average number of O-letters: 1.15


    PREFIXES

                                           O-letters in prefix
                            ----------------------------------
      R-letters in prefix       0      1      2      3   Total
      --------------------  -----  -----  -----  -----   -----
      0 R-letters           12534  10789     34      .   23357
      1 R-letter             1546   1035     30      .    2611
      2 R-letters              10    134     13      1     158
      3 R-letters               1      .      1      .       2

      Total                 14091  11958     78      1   26128
      Rel. percent           53.9%  45.8%   0.3%   0.0%  100.0%
      Abs. percent           40.4%  34.3%   0.2%   0.0%   74.9%


      Average number of R-letters: 0.11
      Average number of O-letters: 0.46

    SUFFIXES

                                                  O-letters in suffix 
                            -----------------------------------------
      R-letters in suffix       0      1      2      3      4   Total
      --------------------  -----  -----  -----  -----  -----   -----
      0 R-letters             299   7838     83      1      .    8221   
      1 R-letter              641  13857   1377     10      1   15886   
      2 R-letters              29    853    894     69      2    1847             
      3 R-letters               5     10     73     29      2     119   
      4 R-letters               .      .      1      .      1       2

      Total                   974  22558   2428    109      6   26075             
      Rel. percent            3.7%  86.5%   9.3%   0.4%   0.0%  100.0%
      Abs. percent            2.8%  64.7%   7.0%   0.3%   0.0%   74.8%
 
      Average number of R-letters: 0.76
      Average number of O-letters: 1.06

  (The absolute percentages are relative to the total number of words
  in the text. These counts do not include those soft words, prefixes,
  and suffixes --- about 120 of each -- that contain invalid elements
  such as "shh", "oq", unattached "i" or "e", etc. Hence the
  discrepancy between the totals for prefixes and suffixes.)

  Here are the counts (total and in major sections) of individual
  crust patterns, with the R-letters mapped to "R" and the O-letters
  mapped to "o" (so, for example, "daiin" becomes "RoR", and "doaro"
  becomes "RooRo"):

    SOFT WORDS

         tot   pha.2   hea.1   cos.2   zod.1   heb.1   str.2   bio.1  pattern
      ------  ------  ------  ------  ------  ------  ------  ------  -------
        3030     136     821     129      34     223     579     605  RoR
        2605      87     186     123      97     230     842     563  oR
         722      34     101      24      17      70     239      98  oRoR
         475      15     149      17      17      34      46      45  R
         395      18     141      21      10      41      32      80  Ro
         240       3      48      21      15      16      40      38  o
         226       9      24      12       9      26      43      67  oRo
         187       7      25      17       2      18      55      21  RoRoR
         155      10      38       3       .      11      52       7  ooR
         144       3      37      10       2      15      27      28  RoRo
          62       .       8       4       .       6      20      12  RR
          54       4       9       2       2       8      16       6  oRR
          51       2       6       4       3       4      16       7  oRoRo
          47       3       5       4       2       7       7      13  oRRo
          47       3      13       2       .      11       4       .  oRRoR
          31       2       2       .       1       1      14       6  RRoR
          30       2      15       .       1       5       1       1  RoRR
          30       .       6       1       .       5       5       7  RoRRo
          29       .       5       1       .       3       8      10  RRo
          21       1       5       2       1       2       1       3  RoRoRo
          20       2       4       3       .       .       9       1  RooR
          19       2       .       .       1       1       7       3  oRoRoR
          18       .       6       1       .       2       6       1  oo
          15       .       4       1       .       2       3       .  RoRRoR
          14       1       4       .       .       1       6       .  oRoRR
           9       1       2       .       .       3       1       1  oRoRRo
           7       .       .       1       .       1       3       1  RRR
           7       .       .       .       .       .       2       1  RoRoRR
           7       1       1       1       .       1       1       .  ooRoR
           6       .       2       1       .       .       1       .  Roo
           6       1       .       .       .       .       1       .  oRoRoRo
           4       .       .       .       .       .       2       1  RRoRo
           4       .       1       1       .       1       1       .  RoRoRoR
           3       .       .       .       1       1       1       .  oRoo
           2       .       .       .       .       .       2       .  RRRo
           2       .       .       .       .       .       1       1  RRoRoR
           2       .       1       .       .       .       1       .  RoRRR
           2       .       .       .       1       .       .       .  RoRoRRo
           2       .       1       .       .       .       1       .  RooRo
           2       .       .       .       .       1       .       1  oRRoRo
           2       .       .       .       .       1       1       .  oRooR
           2       .       .       .       1       .       1       .  ooRR
           2       1       .       .       .       .       1       .  ooRRoR
           2       .       .       .       1       .       .       .  ooRo
           1       .       .       .       .       .       1       .  RRRR
           1       .       .       .       .       .       .       1  RRRoR
           1       .       1       .       .       .       .       .  RRoRR
           1       .       .       .       1       .       .       .  RRoo
           1       .       1       .       .       .       .       .  RoRRoRoR
           1       .       .       .       1       .       .       .  RoRoRoRo
           1       .       .       .       .       .       1       .  RoRooR
           1       .       .       .       .       .       1       .  oRRRo
           1       1       .       .       .       .       .       .  oRoRoo
           1       1       .       .       .       .       .       .  oRooRo
           1       .       1       .       .       .       .       .  ooRRo
           1       .       1       .       .       .       .       .  ooRRoRo
           1       .       .       .       .       .       1       .  ooRoRR
           1       .       1       .       .       .       .       .  oooRoR
      ------  ------  ------  ------  ------  ------  ------  ------  -------
        8754     350    1675     406     220     751    2103    1629  Total

    PREFIXES

         tot   pha.2   hea.1   cos.2   zod.1   heb.1   str.2   bio.1  pattern
      ------  ------  ------  ------  ------  ------  ------  ------  -------
       12534     490    3081     420     207    1048    3365    2003  -
       10789     383    1575     443     254     871    3559    2103  o-
        1546      20     222      34       3      59     679     384  R-
         882      27      46      13      12      58     265     314  oR-
         153       4      50      10       1      14      32      26  Ro-
         128       .      10       5       2       5      26      59  RoR-
          34       3      19       1       .       2       3       3  oo-
          23       2       4       2       .       2       4       6  oRo-
          10       .       1       .       .       .       6       2  RR-
           9       .       .       3       .       1       2       1  oRoR-
           6       1       1       1       .       .       1       2  oRR-
           4       1       .       .       .       .       1       .  RoRo-
           4       .       2       .       .       1       1       .  Roo-
           3       1       1       .       .       .       1       .  ooR-
           1       .       1       .       .       .       .       .  RRR-
           1       .       .       .       .       .       1       .  RoRoR-
           1       .       .       .       .       .       1       .  oRoRo-
      ------  ------  ------  ------  ------  ------  ------  ------  -------

    SUFFIXES

         tot   pha.2   hea.1   cos.2   zod.1   heb.1   str.2   bio.1  pattern
      ------  ------  ------  ------  ------  ------  ------  ------  -------
        9112     420    2306     271     146     557    2533    1319  -oR
        7838     340    1884     356     164     501    2223    1349  -o
        4745      10      26      47      29     583    1769    1859  -Ro
        1258      81     241      98      42     130     302      46  -oRo
         749       .      15      14      15      88     398     100  -RoR
         741      11      96      49      23      64     277     133  -R
         726      33     170      35      35      38     219      29  -oRoR
         299      12      78      20       2      45      58      26  -
         141       8      49       5       3      14      10      21  -oRRo
         117       5      30      13       5       6      40       1  -ooR
          83       9      32       7       .       5      16       .  -oo
          82       7      32       2       4       2      14       4  -oRR
          64       1      22       2       7       5      14       1  -oRoRo
          34       1      10       3       2       4       4       2  -oRRoR
          29       .       4       2       .       2      15       3  -RR
          24       1       2       3       1       7       7       1  -RoRo
          22       .       .       2       1       2       7       8  -RRo
          21       .       4       3       .       2       9       1  -oRoRoR
          20       .       .       1       .       3      11       3  -RoRoR
          10       .       5       .       .       .       1       1  -oRoRR
          10       .       1       1       .       2       3       .  -ooRo
           7       .       1       .       .       .       4       1  -RoRRo
           5       .       1       .       .       1       3       .  -RRR
           4       .       .       1       .       .       2       .  -RoRR
           4       .       .       1       .       2       1       .  -oRRR
           3       .       .       .       .       .       1       2  -RooR
           3       1       .       .       .       .       1       .  -oRRoRo
           3       .       1       1       .       .       .       .  -oRoRRo
           3       .       1       .       .       .       .       .  -oRooR
           2       .       .       .       .       .       1       1  -RRoR
           2       .       .       .       .       .       2       .  -RoRoRo
           2       .       1       .       .       .       1       .  -Roo
           2       .       1       .       .       .       1       .  -ooRoR
           1       .       1       .       .       .       .       .  -RRoRRo
           1       .       .       .       .       1       .       .  -RRoRo
           1       .       .       .       .       .       .       1  -oRRRo
           1       .       .       .       .       .       1       .  -oRoRoRo
           1       .       .       .       .       .       1       .  -oRoRoRoR
           1       .       .       .       .       .       1       .  -oRoRooR
           1       .       1       .       .       .       .       .  -oRooRo
           1       .       .       .       .       .       1       .  -ooRoRo
           1       .       .       .       .       .       .       1  -ooo
           1       .       .       .       .       .       1       .  -oooRo
      ------  ------  ------  ------  ------  ------  ------  ------  -------
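
  The mapping from crust strings to the patterns tabulated above can be
  sketched like this. The regular expression is an assumed reading of
  the dealer, final and circle definitions, and the function name is
  illustrative.

```python
import re

# Assumed crust tokens: an R element is a dealer or final with optional
# leading "i"s and a trailing "e"; a circle is one of a, o, y.
CRUST = re.compile(r"(?P<R>i*[dlrsxvnmgj]e?)|(?P<o>[aoy])")

def crust_pattern(text):
    """Map a crust string to its R/o pattern, or None if it does not parse."""
    pos, out = 0, []
    while pos < len(text):
        m = CRUST.match(text, pos)
        if m is None:
            return None
        out.append(m.lastgroup)  # "R" or "o", the name of the matched group
        pos = m.end()
    return "".join(out)

print(crust_pattern("daiin"), crust_pattern("doaro"))
```

  Strings containing gallows or benches do not parse as crust, so the
  function returns None for them.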

  We can see that consecutive R's and consecutive O's are rare, but
  not enough to be classed as errors:
  
    soft words with RR =  409 (4.7% of soft words)
    prefixes with RR =     17 (0.1% of non-empty prefixes)
    suffixes with RR =    349 (1.4% of non-empty suffixes)
    
    soft words with OO =  227 (2.6% of soft words)
    prefixes with OO =     41 (0.3% of non-empty prefixes)
    suffixes with OO =    225 (0.9% of non-empty suffixes)
  
  Words with consecutive RRRs and OOOs are extremely rare.
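
  As a check on the figures above, the prefix counts can be recomputed
  from the "tot" column of the PREFIXES table. The dictionary layout and
  function name are illustrative; the same approach applies to the soft
  word and suffix tables.

```python
# Prefix pattern counts ("tot" column); "" stands for the empty prefix "-".
prefixes = {
    "": 12534, "o": 10789, "R": 1546, "oR": 882, "Ro": 153, "RoR": 128,
    "oo": 34, "oRo": 23, "RR": 10, "oRoR": 9, "oRR": 6, "RoRo": 4,
    "Roo": 4, "ooR": 3, "RRR": 1, "RoRoR": 1, "oRoRo": 1,
}

def count_with(patterns, run):
    """Total count of patterns that contain the given run of letters."""
    return sum(n for pat, n in patterns.items() if run in pat)

non_empty = sum(n for pat, n in prefixes.items() if pat)
for run in ("RR", "oo"):
    n = count_with(prefixes, run)
    print(f"prefixes with {run}: {n} ({100 * n / non_empty:.1f}% of non-empty)")
```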

  These low counts show that the R-letters, like the O-letters, are
  not randomly distributed --- they tend to alternate with the O's.
  This alternation is not simply a consequence of
  mutual repulsion between the O's. Compare for instance the following
  entries from the soft word table:
  
         tot   pha.2   hea.1   cos.2   zod.1   heb.1   str.2   bio.1  pattern
      ------  ------  ------  ------  ------  ------  ------  ------  -------
        3030     136     821     129      34     223     579     605  RoR
          54       4       9       2       2       8      16       6  oRR
          29       .       5       1       .       3       8      10  RRo

         722      34     101      24      17      70     239      98  oRoR
         144       3      37      10       2      15      27      28  RoRo
          47       3       5       4       2       7       7      13  oRRo
  
  If avoidance of OO were the only force acting here, then the
  frequencies of "oRR" and "RRo" should be similar to those of "RoR".
  Ditto for "oRoR", "RoRo", and "oRRo".

  Note that this alternation of R-letters and O-letters confirms that 
  the two classes are qualitatively distinct.
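
The comparison above can be reproduced with a few lines of code. This is
only a sketch: the counts are copied from the "tot" column of the table
excerpt, and the grouping into two sets of three patterns (and the helper
name `share_by_pattern`) is an assumption made for illustration, not part
of the original tabulation.

```python
# Counts copied from the "tot" column of the soft-word table excerpt.
counts = {
    "RoR": 3030, "oRR": 54, "RRo": 29,
    "oRoR": 722, "RoRo": 144, "oRRo": 47,
}

def share_by_pattern(patterns):
    """Return each pattern's share of the combined count of its group."""
    total = sum(counts[p] for p in patterns)
    return {p: counts[p] / total for p in patterns}

# The two groups compared in the text: same letters, different order.
for group in (["RoR", "oRR", "RRo"], ["oRoR", "RoRo", "oRRo"]):
    for pattern, share in share_by_pattern(group).items():
        kind = "has RR" if "RR" in pattern else "alternating"
        print(f"{pattern:5s} {counts[pattern]:5d} {share:6.1%}  {kind}")
```

If only OO were avoided, the three patterns in the first group would be
expected to have comparable shares; the printed shares make the imbalance
easy to see.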
  
Well, enough for now....

All the best,

--stolfi

From jim@mail.rand.org  Mon Jan 24 07:20:50 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id HAA56736
	for <reeds@fry.research.att.com>; Mon, 24 Jan 2000 07:20:50 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id 2F81B4CE0D; Mon, 24 Jan 2000 07:20:50 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-blue.research.att.com (Postfix) with ESMTP id 4ED9E4CE09
	for <reeds@research.att.com>; Mon, 24 Jan 2000 07:20:49 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id EAA17456; Mon, 24 Jan 2000 04:20:40 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id EAA22914; Mon, 24 Jan 2000 04:20:39 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id EAA07911 for <voynich@rand.org>; Mon, 24 Jan 2000 04:20:25 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id EAA22897 for <voynich@rand.org>; Mon, 24 Jan 2000 04:20:25 -0800 (PST)
Received: from mailer3.bham.ac.uk (mailer3.bham.ac.uk [147.188.128.54]) by mail03-lax.pilot.net with ESMTP id EAA17791 for <voynich@rand.org>; Mon, 24 Jan 2000 04:20:23 -0800 (PST)
Received: from bham.ac.uk ([147.188.128.127])
	by mailer3.bham.ac.uk with esmtp (Exim 3.02 #16)
	id 12CiTW-00030h-00
	for voynich@rand.org; Mon, 24 Jan 2000 12:20:06 +0000
Received: from is-fs13.bham.ac.uk ([147.188.128.55])
	by bham.ac.uk with esmtp (Exim 3.10 #1)
	id 12CiTW-0003sQ-00
	for voynich@rand.org; Mon, 24 Jan 2000 12:20:06 +0000
Received: from BHAM-IS-FS13/SpoolDir by is-fs13.bham.ac.uk (Mercury 1.44);
    24 Jan 00 12:20:06 +0000
Received: from SpoolDir by BHAM-IS-FS13 (Mercury 1.44); 24 Jan 00 12:20:00 +0000
Received: from golem (147.188.72.20) by is-fs13.bham.ac.uk (Mercury 1.44) with ESMTP;
    24 Jan 00 12:19:37 +0000
From: "Gabriel Landini" <G.Landini@bham.ac.uk>
Organization: The University of Birmingham, UK.
To: voynich@rand.org
Date: Mon, 24 Jan 2000 12:18:04 -0000
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Subject: Re: LSC and the VMS
Reply-To: G.Landini@bham.ac.uk
In-reply-to: <388B4068.C18A1B8E@nctimes.net>
X-mailer: Pegasus Mail for Win32 (v3.12a)
Message-ID: <D12C453B@is-fs13.bham.ac.uk>
Sender: jim@mail.rand.org
Status: OR

Meaningful vs. meaningless texts... I think we are missing the point.
Meaningful texts can only be written with valid words (allowing for a 
few words whose meaning one may not know) PLUS some kind of 
valid grammar. 

So character-monkey texts can never be meaningful (at least for 
relatively low orders) because most of the words are not valid 
(although statistically indistinguishable) and there is no underlying 
grammar.

On the other hand, word-monkey texts are meaningless texts to 
which one can assign some "meaning", which arises from two facts: 
all words are valid, and the n-order word transition probabilities 
in fact reconstruct, in statistical terms, the building blocks of the 
grammar structure that can be derived from the original texts.
Of course these are not "meaningful" in the sense of being the 
result of a thinking process; they are just (let's say) statistically 
meaningful. 
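
For readers who have not built one, a word monkey of order n can be
sketched as follows. This is a minimal illustration, not anybody's actual
program: the function names, the toy source text, and the restart rule at
dead ends are all assumptions.

```python
import random
from collections import defaultdict

def build_word_monkey(words, order=1):
    """Map each run of `order` consecutive words to the words seen after it."""
    table = defaultdict(list)
    for i in range(len(words) - order):
        table[tuple(words[i:i + order])].append(words[i + order])
    return table

def generate(table, order, length, rng):
    """Emit `length` words by sampling only transitions seen in the source."""
    out = list(rng.choice(sorted(table)))
    while len(out) < length:
        followers = table.get(tuple(out[-order:]))
        if not followers:
            # Dead end (context seen only at the very end of the source):
            # restart from a randomly chosen context.
            out.extend(rng.choice(sorted(table)))
            continue
        out.append(rng.choice(followers))
    return out[:length]

# Toy source text, invented for this example.
source = ("the cat sat on the mat and the dog sat on the rug "
          "and the cat saw the dog on the mat").split()
table = build_word_monkey(source, order=1)
text = generate(table, order=1, length=12, rng=random.Random(0))
print(" ".join(text))
```

Every word in the output is valid, and every adjacent pair occurred in the
source, yet nothing in the procedure knows what any word means.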

So where does this leave us? I think that the construction of 
concordance tables is the first attempt to find some structure above 
the word level (of course this is very much related to n-order 
word-monkey probability tables). Jorge did this table, but I am not 
sure how to carry that further.

Perhaps we can attack the problem from 2 points of view.
1. Look at it statistically at the character level: LSC, long-range 
correlations, Hurst exponent, etc. Very nice, but we always seem 
to be getting the same result: the VMS is not random, but we're not 
sure whether there is a language. Moreover, the problem we face is that 
we do not have a clue about how anybody could produce language-like 
gibberish.

2. Try to find the building blocks of the underlying grammar. 
I think that  this is crucial to make sure that there is a language to be 
cracked. 

And we also have "meaningful texts" that are not logical, or that just 
have the wrong meaning, but that is another story.

Cheers,

Gabriel



From jim@mail.rand.org  Mon Jan 24 08:31:55 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id IAA91317
	for <reeds@fry.research.att.com>; Mon, 24 Jan 2000 08:31:54 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id CF2A24CE09; Mon, 24 Jan 2000 08:31:54 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail02-lax.pilot.net (mail-lax-2.pilot.net [205.139.40.16])
	by mail-blue.research.att.com (Postfix) with ESMTP id 5596B4CE08
	for <reeds@research.att.com>; Mon, 24 Jan 2000 08:31:54 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail02-lax.pilot.net with ESMTP id FAA17372; Mon, 24 Jan 2000 05:31:47 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id FAA24480; Mon, 24 Jan 2000 05:31:46 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id FAA09670 for <voynich@rand.org>; Mon, 24 Jan 2000 05:31:24 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id FAA24383 for <voynich@rand.org>; Mon, 24 Jan 2000 05:31:23 -0800 (PST)
Received: from grande.dcc.unicamp.br (grande.dcc.unicamp.br [143.106.7.8]) by mail01-lax.pilot.net with ESMTP id FAA24028 for <voynich@rand.org>; Mon, 24 Jan 2000 05:27:30 -0800 (PST)
Received: from amazonas.dcc.unicamp.br (amazonas.dcc.unicamp.br [143.106.7.11])
	by grande.dcc.unicamp.br (8.9.3/8.9.3) with ESMTP id LAA01884
	for <voynich@rand.org>; Mon, 24 Jan 2000 11:26:49 -0200 (EDT)
Received: from coruja.dcc.unicamp.br (coruja.dcc.unicamp.br [143.106.24.80])
	by amazonas.dcc.unicamp.br (8.8.5/8.8.5) with ESMTP id LAA18994
	for <voynich@rand.org>; Mon, 24 Jan 2000 11:26:49 -0200 (EDT)
Received: (from stolfi@localhost)
	by coruja.dcc.unicamp.br (8.8.5/8.8.5) id LAA04251;
	Mon, 24 Jan 2000 11:26:48 -0200 (EDT)
Date: Mon, 24 Jan 2000 11:26:48 -0200 (EDT)
Message-Id: <200001241326.LAA04251@coruja.dcc.unicamp.br>
From: Jorge Stolfi <stolfi@dcc.unicamp.br>
To: voynich@rand.org
Subject: Tibetan
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=iso-8859-1
Reply-To: stolfi@dcc.unicamp.br
Sender: jim@mail.rand.org
Status: OR


    | bsnams nas mnyan yod kyi grong khyer chen por bsod snyoms kyi
    | phyir zhugs so, ,de nas bcom ldan 'das mnyan yod kyi grong khyer
    | chen por bsod snyoms kyi phyir gshegs nas bsod
    | 
    | snyoms kyi zhal zas mjug tu gsol te zas kyi bya ba mdzad nas zas
    | phyi ma'i bsod snyoms spangs pas, lhung bzed dang chos gos bzhag
    | nas zhabs bsil te gdan bshams pa la
    | 
    | skyil mo krung bcas nas sku drang por bsrang ste dran pa mngon
    | du bzhag nas bzhugs so, ,de nas dge slong mang po bcom ldan 'das
    | ga la ba der dong ste lhags nas bcom ldan 'das
    | 
    | kyi zhabs la mgo bos phyag 'tsal te bcom ldan 'das la lan gsum
    | bskor ba byas nas phyogs gcig tu 'khod do, ,yang de'i tse tse
    | dang ldan pa rab 'byor 'khor de nyid du 'dus


That is a sample of Tibetan in Roman transcription, from the "Diamond
Cutter Sutra" (ca. 500 BC). The full text can be found at
http://worldtrans.org/CyberSangha/findex.html , file COMPLETE/KD0016F.ZIP

I could not find much material about the Tibetan language and its
spelling, not even in our Linguistics Dept. library. (That may give
you an idea of the general level of things down here...). Crystal's
"Encyclopedia of Language" has only a brief note; Comrie's "The
World's Major Languages" omits it altogether (hopefully it will be in
the coming sequel, "The World's Sargeant Languages" 8-).

Anyway, here is all I think I know:

    Tibetan is genetically related to the Chinese "dialects", to
    Burmese, and to a few other languages of East Asia. It is a
    monosyllabic, tonal language; I presume that, like Chinese, it has
    no articles, no word inflections (hence no clear-cut lexical
    categories), and generally omits the verb "to be".

    The Tibetan script is very ancient; it is derived from an old
    Indian alphabet, and therefore is somewhat similar to Sanskrit's
    Devanagari and modern Hindi scripts. The standard spelling too is
    very old, so it is now quite inconsistent with the spoken
    language.

    I believe that the sample above is not a phonetic transcription,
    but only a transliteration of the native spelling into Roman
    letters. That would explain the unpronounceable clusters like
    "bsk" and "mdz": some of those letters are not sounds, but coded
    indications of syllable tone. From indirect evidence, I guess that
    many of the word-initial (and possibly word-final) r,b,m,g,s are
    silent tone marks.
  
I would be grateful for any additional information on Tibetan and
its spelling system. 

Meanwhile, let me argue the case for Voynichese = Tibetan.

Historically, that theory seems quite possible, since there have been
many European travelers to Tibet, before and after Marco Polo; and
presumably there were many Tibetan travelers to Europe as well.

Although Tibetan had a standard spelling, the reportedly great
distance between spelling and pronunciation could have motivated the
author to invent a new alphabet and spelling. (This motivation would
be much weaker for Arabic, for instance, whose standard spelling would
probably seem quite adequate to anyone who bothered to learn the
language.)

As I observed before, the word length distribution and word structure
of the VMS strongly suggest that the "words" are actually syllables,
and therefore that Voynichese is a syllabic language -- which fits
Tibetan as well as Chinese. The lack of discernible grammar and the
frequent word repetitions are at least consistent with that theory.
(Note the three "zas" on line 4, and the "tse tse" on line 11.)
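
The repetitions noted above are easy to check mechanically. The sketch
below uses only lines 4 and 11 of the sample (counting text lines, not the
blank separators); the tokenization rule (split on whitespace and commas,
keep apostrophes) is an assumption about how the transliteration should be
cut into "words".

```python
import re
from collections import Counter

# Lines 4 and 11 of the sample quoted at the top of this message.
sample = [
    "snyoms kyi zhal zas mjug tu gsol te zas kyi bya ba mdzad nas zas",
    "bskor ba byas nas phyogs gcig tu 'khod do, ,yang de'i tse tse",
]

def tokens(line):
    """Split on whitespace and commas; apostrophes stay inside the word."""
    return [w for w in re.split(r"[\s,]+", line) if w]

def repeated_words(line):
    """Words that occur more than once on the line, with their counts."""
    return {w: n for w, n in Counter(tokens(line)).items() if n > 1}

def adjacent_repeats(line):
    """Words that are immediately followed by an identical word."""
    ws = tokens(line)
    return [a for a, b in zip(ws, ws[1:]) if a == b]

def length_histogram(lines):
    """Distribution of word lengths, in transliterated letters."""
    return Counter(len(w) for line in lines for w in tokens(line))

for line in sample:
    print(repeated_words(line), adjacent_repeats(line))
print(sorted(length_histogram(sample).items()))
```

The same three functions could be run unchanged over a VMS transcription
to compare repetition rates and word-length distributions.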

Finally, while I won't be fool enough to propose a mapping from
Tibetan to Voynichese, I will say that the use of prefixed and
suffixed letters to indicate tone, which would seem `natural' to a
Tibetan, could conceivably produce the kind of "word" structure that
we see in the VMS.


All the best,

--stolfi

From jim@mail.rand.org  Mon Jan 24 11:32:35 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id LAA89346
	for <reeds@fry.research.att.com>; Mon, 24 Jan 2000 11:32:34 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id 0743D4CE10; Mon, 24 Jan 2000 11:32:34 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-blue.research.att.com (Postfix) with ESMTP id 657134CE09
	for <reeds@research.att.com>; Mon, 24 Jan 2000 11:32:33 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id IAA19277; Mon, 24 Jan 2000 08:32:29 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id IAA06776; Mon, 24 Jan 2000 08:32:28 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id IAA28704 for <voynich@rand.org>; Mon, 24 Jan 2000 08:30:25 -0800 (PST)
From: RSRICHMOND@aol.com
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id IAA06603 for <voynich@rand.org>; Mon, 24 Jan 2000 08:30:23 -0800 (PST)
Received: from imo18.mx.aol.com (imo18.mx.aol.com [152.163.225.8]) by mail02-lax.pilot.net with ESMTP id IAA13044 for <voynich@rand.org>; Mon, 24 Jan 2000 08:30:23 -0800 (PST)
Received: from RSRICHMOND@aol.com
	by imo18.mx.aol.com (mail_out_v24.6.) id 6.13.6cb2eb (4331)
	 for <voynich@rand.org>; Mon, 24 Jan 2000 11:29:50 -0500 (EST)
Message-ID: <13.6cb2eb.25bdd7fe@aol.com>
Date: Mon, 24 Jan 2000 11:29:50 EST
Subject: Re:  Tibetan
To: voynich@rand.org
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Mailer: AOL 3.0.1 for Mac sub 78
Sender: jim@mail.rand.org
Status: OR

Tibetan has a strongly historical orthography! Sinicists apparently believe 
that the tongue-boggling consonant clusters fore and aft a lone vowel were 
originally pronounced, and that primary tone arose as clusters became simpler 
(in somewhat the same way that phonemic vowel length is arising in English 
with loss of final clusters, so that the vowel of /he:p/ 'help' may contrast 
with /step/). - But none of the Tibetan consonants are tone markers - I don't 
think the orthography marks tone, which I believe bears some but not very 
much contrast load in contemporary Tibetan.
I believe that the use of final consonants to mark tone in Romanizations is a 
20th century invention - Mary Haas used it for Burmese in the 1940's, Smalley 
for Hmong in the '50's. 

Bob Richmond
Knoxville TN

From jim@mail.rand.org  Mon Jan 24 17:01:52 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id RAA53874
	for <reeds@fry.research.att.com>; Mon, 24 Jan 2000 17:01:52 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id 6CA491E064; Mon, 24 Jan 2000 17:01:52 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail03-lax.pilot.net (mail-lax-3.pilot.net [205.139.40.17])
	by mail-green.research.att.com (Postfix) with ESMTP id DCEBB1E021
	for <reeds@research.att.com>; Mon, 24 Jan 2000 17:01:51 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail03-lax.pilot.net with ESMTP id OAA25667; Mon, 24 Jan 2000 14:01:41 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id OAA06664; Mon, 24 Jan 2000 14:01:38 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id OAA14313 for <voynich@rand.org>; Mon, 24 Jan 2000 14:01:00 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id OAA06576 for <voynich@rand.org>; Mon, 24 Jan 2000 14:00:58 -0800 (PST)
Received: from mailout04.sul.t-online.de (mailout04.sul.t-online.de [194.25.134.18]) by mail03-lax.pilot.net with ESMTP id OAA25194 for <voynich@rand.org>; Mon, 24 Jan 2000 14:00:57 -0800 (PST)
Received: from fwd01.sul.t-online.de 
	by mailout04.sul.t-online.de with smtp 
	id 12CrXc-0000ah-03; Mon, 24 Jan 2000 23:00:56 +0100
Received: from  (0625764225-0001@[193.159.141.174]) by fwd01.sul.t-online.de
	with smtp id 12CrXU-0oye8mC; Mon, 24 Jan 2000 23:00:48 +0100
From: Zandbergen@t-online.de (Rene)
To: voynich@rand.org
References: <200001221227.KAA02811@coruja.dcc.unicamp.br>
	 <12CPS7-0TZJ44C@fwd02.sul.t-online.de> <388B4068.C18A1B8E@nctimes.net>
Subject: meaning-less/full
X-Mailer: T-Online eMail 2.3
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT
Date: Mon, 24 Jan 2000 23:00:48 +0100
Message-ID: <12CrXU-0oye8mC@fwd01.sul.t-online.de>
X-Sender: 0625764225-0001@t-dialin.net
Sender: jim@mail.rand.org
Status: OR

Mark wrote:

>  Rene, if you have in mind a rigorous mathematical definition of meaningful vs
> meaningless, it may be indeed a hard task (but certainly possible). However,
> for the beginning we just may be satisfied with the fact that when we see a
> meaningful text we recognize it as such (if we know the language). Of course,
> when you deal with an unknown language, the situation is different, but what
> we can do is to test various forms of meaningless texts, compare them to a
> variety of meaningful texts in various languages, and try to determine common
> features in each of those two types of texts. LSC can probably be useful as one
> of the tools, which though had better be complemented by other tools.

A mathematical definition doesn't seem feasible.  And yes, I have got some
intuitive feeling for what's meaningful and what isn't. But the more I
think of it, the more I realise that it isn't enough. Why? If we want to
check whether the LSC correctly identifies a text as meaningful or meaningless,
we must know whether the text is or not, and then check the LSC curve to
see if it matches or not. All your examples are very clear cut.
(Except one: the VMs, but about that later).

What are the doubtful cases? 

- Think of Jacques' telegram-style recipes. This is more towards the
meaningful side, but it could have someone very puzzled.
- Or a text in which every third word has been struck out. It is un-
grammatical but depending on the complexity of the original text, it
may be right in the middle between meaningful and meaningless.
- Or take a text in which every character has been replaced by the
next one in the alphabet. Totally meaningless. Yet the LSC defines it
as fully meaningful. And it becomes meaningful once you know the trick
and learn to read it (half an hour's practice might be enough to develop
a good reading speed).
- Probably (definitely) much, much more.
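
The next-letter example can be made concrete. The sketch below (function
names are mine, and it handles only the letters a-z) shows why any test
that looks at letter statistics alone cannot tell the shifted text from
the original: the substitution relabels the letters but leaves the
frequency profile and every positional pattern untouched.

```python
import string
from collections import Counter

ALPHABET = string.ascii_lowercase
SHIFTED = ALPHABET[1:] + ALPHABET[:1]          # b, c, ..., z, a

def shift_next(text):
    """Replace each letter a-z by its successor (z wraps to a)."""
    return text.lower().translate(str.maketrans(ALPHABET, SHIFTED))

def shift_prev(text):
    """Undo shift_next."""
    return text.translate(str.maketrans(SHIFTED, ALPHABET))

def count_profile(text):
    """Sorted letter counts: the frequency profile with the labels removed."""
    return sorted(Counter(c for c in text if c in ALPHABET).values())

plain = "the quick brown fox jumps over the lazy dog"
coded = shift_next(plain)
print(coded)
print(count_profile(plain) == count_profile(coded))
```

Whether one calls the shifted text "meaningless" or "meaningful in another
alphabet" is exactly the question under discussion; the statistics cannot
settle it.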

Is 'bogorodice djevo raduysia' Russian or the result of a Russian character
monkey? Before last Xmas I wouldn't have known but the LSC could have told
me (given more text, of course). And this gets us to the question of the 
VMs. That is as readable to me as Russian. In fact, it is more readable
to me than Arabic. The LSC classifies it as meaningful, and all the
experiments Mark has done help to reinforce the conclusion.
But could it be in the grey area above?

The point: we're not sure what we're measuring. And that isn't the 
first time in the history of the VMs, to put it mildly.
Still, as an engineer, I feel that it shouldn't stop us from experimenting.

Cheers, Rene

From jim@mail.rand.org  Mon Jan 24 19:06:45 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id TAA81280
	for <reeds@fry.research.att.com>; Mon, 24 Jan 2000 19:06:45 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id E37344CE1E; Mon, 24 Jan 2000 19:06:44 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-blue.research.att.com (Postfix) with ESMTP id 3327C4CE10
	for <reeds@research.att.com>; Mon, 24 Jan 2000 19:06:44 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id QAA26569; Mon, 24 Jan 2000 16:06:17 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id QAA16250; Mon, 24 Jan 2000 16:06:14 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id QAA29276 for <voynich@rand.org>; Mon, 24 Jan 2000 16:05:59 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id QAA16200 for <voynich@rand.org>; Mon, 24 Jan 2000 16:05:57 -0800 (PST)
Received: from grande.dcc.unicamp.br (grande.dcc.unicamp.br [143.106.7.8]) by mail01-lax.pilot.net with ESMTP id QAA26376 for <voynich@rand.org>; Mon, 24 Jan 2000 16:05:54 -0800 (PST)
Received: from amazonas.dcc.unicamp.br (amazonas.dcc.unicamp.br [143.106.7.11])
	by grande.dcc.unicamp.br (8.9.3/8.9.3) with ESMTP id WAA16942
	for <voynich@rand.org>; Mon, 24 Jan 2000 22:05:25 -0200 (EDT)
Received: from coruja.dcc.unicamp.br (coruja.dcc.unicamp.br [143.106.24.80])
	by amazonas.dcc.unicamp.br (8.8.5/8.8.5) with ESMTP id WAA15729
	for <voynich@rand.org>; Mon, 24 Jan 2000 22:05:23 -0200 (EDT)
Received: (from stolfi@localhost)
	by coruja.dcc.unicamp.br (8.8.5/8.8.5) id WAA06066;
	Mon, 24 Jan 2000 22:05:23 -0200 (EDT)
Date: Mon, 24 Jan 2000 22:05:23 -0200 (EDT)
Message-Id: <200001250005.WAA06066@coruja.dcc.unicamp.br>
From: Jorge Stolfi <stolfi@dcc.unicamp.br>
To: voynich@rand.org
Subject: Re: meaning-less/full
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=iso-8859-1
In-Reply-To: <12CrXU-0oye8mC@fwd01.sul.t-online.de>
References: <200001221227.KAA02811@coruja.dcc.unicamp.br>
	<12CPS7-0TZJ44C@fwd02.sul.t-online.de>
	<388B4068.C18A1B8E@nctimes.net>
	<12CrXU-0oye8mC@fwd01.sul.t-online.de>
Reply-To: stolfi@dcc.unicamp.br
Sender: jim@mail.rand.org
Status: OR


    > A mathematical definition doesn't seem feasible.
    
Right. Meaning is not a property of the message alone. In fact,
information theory says that a stream of messages with maximum meaning
will be indistinguishable from a stream of perfectly random strings.
Any deviation from uniform probabilities, or any correlation between
bits, means a waste in channel capacity.

A message has meaning only to the extent that it mirrors some
information that the sender wishes to send. Therefore, in order to
define meaning, one must specify the information to be sent, and the
encoding algorithm; then one can analyze how much of that information
is preserved by the encoding.

Consider that an ideal text compression algorithm should take
"typical" texts and turn them into random-looking strings of bits. Of
course this transformation preserves meaning (as long as one has the
decompression algorithm!); but, for maximum compression, the program
should equalize the bit probabilities and remove any correlations.
Modern compressors like PKZIP go a long way in that direction. The
compressed text, being shorter than the original, will actually have
more meaning per unit length; but it will look like perfect gibberish to
LSC-like tests.
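
A rough way to see the effect of compression on symbol statistics is to
compare the byte entropy before and after. This is a sketch under stated
assumptions: zlib stands in for PKZIP (both use the DEFLATE method), the
"typical text" is a stream of words drawn at random from a small made-up
vocabulary, and byte-histogram entropy is a much cruder measure than the
LSC.

```python
import math
import random
import zlib
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per byte (at most 8)."""
    total = len(data)
    return -sum(n / total * math.log2(n / total)
                for n in Counter(data).values())

# Stand-in for a "typical" text: 5000 words from an invented vocabulary.
rng = random.Random(0)
vocab = ["stone", "river", "and", "the", "of", "tree", "light", "in"]
raw = " ".join(rng.choices(vocab, k=5000)).encode("ascii")

packed = zlib.compress(raw, 9)

print(len(raw), round(byte_entropy(raw), 2))
print(len(packed), round(byte_entropy(packed), 2))
# The content is fully preserved for anyone holding the decompressor.
print(zlib.decompress(packed) == raw)
```

The compressed stream is shorter and its byte histogram is much flatter,
although nothing has been lost.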

Or, consider a meaningful plaintext XORed with the binary expansion of
pi. The result will have uniform bit probabilities, and no visible
correlations; but it will still carry the original meaning, which can
be easily recovered. It would take a very sophisticated algorithm (one
that knows that pi is a "special" number) to notice that the text is
not an entirely random string of bits.
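
The pi example can be tried directly. In this sketch the binary expansion
of the fractional part of pi is computed with Machin's formula in integer
fixed-point arithmetic; the function names and the 64 guard bits are
choices made here for illustration.

```python
def pi_fraction_bytes(nbytes: int) -> bytes:
    """First `nbytes` bytes of the binary expansion of pi's fractional part."""
    guard = 64                          # extra bits to absorb truncation error
    prec = 8 * nbytes + guard
    one = 1 << prec

    def arctan_inv(x: int) -> int:
        """arctan(1/x) in fixed point, by the alternating Taylor series."""
        term = one // x
        total = term
        x2, n, sign = x * x, 1, -1
        while term:
            term //= x2
            n += 2
            total += sign * (term // n)
            sign = -sign
        return total

    # Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239)
    pi_fixed = 16 * arctan_inv(5) - 4 * arctan_inv(239)
    frac = pi_fixed - 3 * one           # drop the integer part
    return (frac >> guard).to_bytes(nbytes, "big")

def xor_with_pi(data: bytes) -> bytes:
    """XOR the data with equally many bytes of pi; applying it twice undoes it."""
    key = pi_fraction_bytes(len(data))
    return bytes(a ^ b for a, b in zip(data, key))

message = b"a meaningful plaintext, at least to its author"
scrambled = xor_with_pi(message)
print(scrambled.hex())
print(xor_with_pi(scrambled) == message)
```

The scrambled bytes show none of the letter statistics of the plaintext,
yet the second call recovers the message exactly.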

So the LSC and possible variants are not tests of `meaning' but rather
of `naturalness.' They work because natural language uses its medium
rather inefficiently, but in a rather peculiar way: it uses symbols
with unequal frequencies (a feature that mechanical monkeys can
imitate), but changes those frequencies over long distances (something
which simple monkeys won't do).
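
The idea of frequencies that change over long distances can be illustrated
with a toy statistic. To be clear, this is NOT the published LSC
definition; it is an assumed stand-in in the same spirit: cut the text into
chunks of n letters and add up, over adjacent chunks and over letters, the
squared difference in letter counts.

```python
from collections import Counter

def chunk_variation(text: str, n: int) -> int:
    """Sum, over adjacent length-n chunks and over letters, of the squared
    difference in that letter's count between the two chunks."""
    chunks = [text[i:i + n] for i in range(0, len(text) - n + 1, n)]
    counts = [Counter(chunk) for chunk in chunks]
    total = 0
    for left, right in zip(counts, counts[1:]):
        for letter in set(left) | set(right):
            total += (left[letter] - right[letter]) ** 2
    return total

# Two artificial texts with identical overall letter frequencies.
drifting = "a" * 100 + "b" * 100    # frequencies change over long distances
uniform = "ab" * 100                # same letters, evenly mixed throughout

print(chunk_variation(drifting, 50))
print(chunk_variation(uniform, 50))
```

A simple monkey that draws letters with fixed probabilities behaves like
the second text at large n; a text whose vocabulary drifts from passage to
passage behaves more like the first.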

However, with slightly smarter monkeys one *can* generate meaningless
texts that fool the LSC; and the same applies for any "meaning
detector" that looks only at the message. Conversely, one can always
encode a meaningful text so as to make it look "random" to the LSC. In
short, a naturally produced (and natural-looking) text can be quite
meaningless, while a meaningful text may be (and look) quite
unnatural.

All the best,

--stolfi

From jim@mail.rand.org  Tue Jan 25 03:37:53 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id DAA98950
	for <reeds@fry.research.att.com>; Tue, 25 Jan 2000 03:37:53 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id 349F01E00B; Tue, 25 Jan 2000 03:37:53 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-green.research.att.com (Postfix) with ESMTP id B33CE1E009
	for <reeds@research.att.com>; Tue, 25 Jan 2000 03:37:52 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id AAA03877; Tue, 25 Jan 2000 00:37:48 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id AAA03992; Tue, 25 Jan 2000 00:37:47 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id AAA23436 for <voynich@rand.org>; Tue, 25 Jan 2000 00:37:17 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id AAA03963 for <voynich@rand.org>; Tue, 25 Jan 2000 00:37:16 -0800 (PST)
Received: from mail.nctimes.net (mail.nctimes.net [208.239.16.66]) by mail01-lax.pilot.net with ESMTP id AAA03838 for <voynich@rand.org>; Tue, 25 Jan 2000 00:37:16 -0800 (PST)
Received: from nctimes.net ([208.239.20.9]) by mail.nctimes.net
          (Netscape Messaging Server 3.5)  with ESMTP id AAAA6E;
          Tue, 25 Jan 2000 00:33:33 -0800
Message-ID: <388D605B.5653083A@nctimes.net>
Date: Tue, 25 Jan 2000 00:35:39 -0800
From: Mark Perakh <perakh@nctimes.net>
Reply-To: perakh@nctimes.net
Organization: home
X-Mailer: Mozilla 4.7 [en] (Win98; I)
X-Accept-Language: en,ru
MIME-Version: 1.0
To: Rene <Zandbergen@t-online.de>
Cc: voynich@rand.org
Subject: Re: meaning-less/full
References: <200001221227.KAA02811@coruja.dcc.unicamp.br>
		 <12CPS7-0TZJ44C@fwd02.sul.t-online.de> <388B4068.C18A1B8E@nctimes.net> <12CrXU-0oye8mC@fwd01.sul.t-online.de>
Content-Type: multipart/alternative;
 boundary="------------8DBFACE760584997E8E32D8D"
Sender: jim@mail.rand.org
Status: OR


--------------8DBFACE760584997E8E32D8D
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit



Rene wrote:

> What are the doubtful cases?
>
> - Think of Jacques' telegram-style recipes. This is more towards the
> meaningful side, but it could have someone very puzzled.
> - Or a text in which every third word has been struck out. It is un-
> grammatical but depending on the complexity of the original text, it
> may be right in the middle between meaningful and meaningless.

Rene and Jorge: You may have noticed that in our work with Brendan we
systematically measured the LSC of meaningful texts from which either all
vowels or all consonants had been removed. In all cases explored, the PMP
shifted to lower n and the DOM decreased, but the LSC curve still looked
typical of meaningful texts, even though this is an extreme case of your
example with every third word struck out. I consider such vowelless or
consonantless texts a form of abbreviation, so in my classification they
are meaningful, although it may be hard to make sense of them, especially
the consonantless ones.

In another example, I took a meaningful text and converted it into what
the debunkers of the alleged Bible code call a skip-text (a short example
was described briefly in paper 2 on LSC). Subsequently one of the
debunkers, Randy Ingermanson, analyzed skip-texts quite thoroughly using
statistics different from the LSC (partially based on chi-square). Despite
the fact that skip-texts preserve the entropy of the original texts (as
they are obtained by rearranging the letters or n-grams of the original
text according to a regular, non-random, reversible procedure), Randy found
that they largely display the statistics of meaningless texts. That fits
the fact that we cannot make sense of them unless we know that they are
skip-texts and what the value of the skip is. Randy's results are
described in his book "Who wrote the Bible code." That book has a very
good statistical appendix on Randy's web page.

On the other hand, I once asked my son to encode the Song of Hiawatha
without telling me how he encoded it. His encoding obviously involved a
high rate of compression. The LSC of the encoded text was like that of
gibberish. Skip-texts are also texts encoded in a simple way, and they
preserve the length of the text exactly. My son's encoding was rather
complex and shortened the text. I did not know what to make of it, so I
did not pursue the study of encoded texts by LSC. It can be surmised that
such a study could shed some light on the mathematically definable
distinction between meaningful texts and gibberish.

Another story is what we did with Brendan. He would email me LSC data for
various texts unknown to me, which he created by all kinds of reshuffling
of alphabets, etc., and my task was to guess, first, whether the text was
meaningful or not, and if not, how it was made up. I answered the first
question correctly in all cases, and the second with a reasonable rate of
success. This served for us as a proof of the objectivity of LSC data as
well as of our reasonable understanding of it.

Therefore, while I agree with you that a math definition of meaningfulness
is a bird which is hard to catch, I still have a feeling that some
reasonable criterion could be formulated on the basis of the empirical
accumulation of data. We could probably define a criterion of
meaningfulness as some number combining certain experimental quantitative
characteristics, which would be within certain limits for all the studied
texts we recognize as meaningful and beyond those limits for all the
studied texts we recognize as gibberish. Such a criterion would be
imperfect but still useful. Of course I am saying all that because my
background and experience are those of a physicist rather than of a
mathematician or a linguist. I apologize for the disordered discussion;
these are just raw ideas, which all of you can dismiss if you feel so
inclined.
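
For concreteness, here is one way a skip-text can be built and undone. The
construction used (read every k-th character starting at offset 0, then at
offset 1, and so on) is an assumption chosen for illustration; it is a
regular, reversible rearrangement of the kind described above, but not
necessarily the exact procedure used in the LSC papers or by Randy.

```python
from collections import Counter

def skip_text(text: str, skip: int) -> str:
    """Read every `skip`-th character from offset 0, then offset 1, etc."""
    return "".join(text[start::skip] for start in range(skip))

def unskip_text(coded: str, skip: int) -> str:
    """Invert skip_text, given the same skip value."""
    n = len(coded)
    out = [""] * n
    pos = 0
    for start in range(skip):
        for i in range(start, n, skip):
            out[i] = coded[pos]
            pos += 1
    return "".join(out)

plain = "by the shores of gitche gumee"
coded = skip_text(plain, 3)
print(coded)
# Same letters, same length, fully recoverable once the skip is known.
print(Counter(coded) == Counter(plain), unskip_text(coded, 3) == plain)
```

The letter counts, and hence the single-letter entropy, are unchanged,
while the local letter order that word-level statistics depend on is
destroyed.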

>
> - Or take a text in which every character has been replaced by the
> next one in the alphabet. Totally meaningless.

No, it is not (see below).

> Yet the LSC defines it
> as fully meaningful.

Yes, because it is indeed meaningful, just using an alphabet where symbol B means
sound A etc. In this case LSC truthfully reports what the text actually is.

> Is 'bogorodice djevo raduysia' Russian or the result of a Russian character
> monkey? Before last Xmas I wouldn't have know but the LSC could have told
> me (given more text, of course).

Of course LSC would tell you. Meaningfulness, in my view, is not about whether or
not a reader can understand it but about whether or not the writer wrote something
meaningful from the writer's viewpoint. The question whether or not  we can
understand a text has nothing to do with its being meaningful.  LSC (and I am not at
all prone to vouch for its versatility or absolute reliability) discerns
meaningfulness regardless of language which can be utterly unknown to us.

> And this gets us to the question of the
> VMs. That is as readable to me as Russian. In fact, it is more readable
> to me than Arabic. The LSC classifies it as meaningful, and all the
> experiments Mark has done help to reinforce the conclusion.
> But could it be in the grey area above?

It certainly can be in a grey or in a striped or in a dappled area. That is why I
suggest that LSC is just one of many possible tools and the more tools are used, the
more we can hope to know if it is grey, or black-and-white.

>
>
> The point: we're not sure what we're measuring. And that isn't the
> first time in the history of the VMs, to put it mildly.
> Still, as an engineer, I feel that it shouldn't stop us from experimenting.

Yes, because it is fun.  Best!  Mark



--------------8DBFACE760584997E8E32D8D--

From jim@mail.rand.org  Tue Jan 25 11:24:38 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id LAA95964
	for <reeds@fry.research.att.com>; Tue, 25 Jan 2000 11:24:38 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id 862B84CE23; Tue, 25 Jan 2000 11:24:38 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-blue.research.att.com (Postfix) with ESMTP id 027324CE1C
	for <reeds@research.att.com>; Tue, 25 Jan 2000 11:24:37 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id IAA24561; Tue, 25 Jan 2000 08:24:33 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id IAA17623; Tue, 25 Jan 2000 08:24:31 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id IAA09276 for <voynich@rand.org>; Tue, 25 Jan 2000 08:24:03 -0800 (PST)
Received: from mail01-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id IAA17521 for <voynich@rand.org>; Tue, 25 Jan 2000 08:24:02 -0800 (PST)
Received: from mailer3.bham.ac.uk (mailer3.bham.ac.uk [147.188.128.54]) by mail01-lax.pilot.net with ESMTP id IAA24286 for <voynich@rand.org>; Tue, 25 Jan 2000 08:24:01 -0800 (PST)
Received: from bham.ac.uk ([147.188.128.127])
	by mailer3.bham.ac.uk with esmtp (Exim 3.02 #16)
	id 12D8l0-0006Hz-00
	for voynich@rand.org; Tue, 25 Jan 2000 16:23:54 +0000
Received: from is-fs13.bham.ac.uk ([147.188.128.55])
	by bham.ac.uk with esmtp (Exim 3.10 #1)
	id 12D8l0-00000Q-01
	for voynich@rand.org; Tue, 25 Jan 2000 16:23:54 +0000
Received: from BHAM-IS-FS13/SpoolDir by is-fs13.bham.ac.uk (Mercury 1.44);
    25 Jan 00 16:23:55 +0000
Received: from SpoolDir by BHAM-IS-FS13 (Mercury 1.44); 25 Jan 00 16:23:46 +0000
Received: from golem (147.188.72.20) by is-fs13.bham.ac.uk (Mercury 1.44) with ESMTP;
    25 Jan 00 16:23:36 +0000
From: "Gabriel Landini" <G.Landini@bham.ac.uk>
Organization: The University of Birmingham, UK.
To: voynich@rand.org
Date: Tue, 25 Jan 2000 16:22:01 -0000
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Subject: Human random number generators
Reply-To: G.Landini@bham.ac.uk
X-mailer: Pegasus Mail for Win32 (v3.12a)
Message-ID: <13AE84E48@is-fs13.bham.ac.uk>
Sender: jim@mail.rand.org
Status: OR

Hi all,
The other day I promised a reference. The paper is by Schenkel, 
Zhang & Zhang (1993) Long range correlations in human writings. 
Fractals Vol. 1 No.1: 47-57.

The paper is not particularly good. I think that there is a problem in 
the way letters are coded:
Each letter is arbitrarily coded as a binary number of 5 bits. The 
sequence of 0's and 1's is converted to a random walk by 
assigning increments of  +1 or -1 depending whether the current bit 
is 1 or 0.
My criticism is that if one assigned "11111" to a common letter then 
there would be a bias in the random walk, but this would not show up 
as strongly if the same letter were coded (let's say) as "10101".
The authors argue that those correlations would not show up 
beyond the word level. I am not sure that this is the case, and 
they do not show any evidence for it either.
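
A minimal sketch of the coding objection, under stated assumptions: the three-letter
alphabet, the 5-bit code table and the function names below are all hypothetical and
are not taken from the paper. It only shows how the choice of code for a common
letter changes the drift of the walk; it does not measure correlations.

```python
import random


def walk_endpoint(text, codes):
    """Final position of the walk: +1 for every 1 bit, -1 for every 0 bit."""
    position = 0
    for ch in text:
        for bit in codes[ch]:
            position += 1 if bit == "1" else -1
    return position


def make_codes(code_for_e):
    """Hypothetical 5-bit table for a toy alphabet; only the code for 'e' varies."""
    return {"e": code_for_e, "t": "01010", "a": "00110"}


# Toy "text" in which 'e' is the most common letter (weights are arbitrary).
rng = random.Random(0)
text = "".join(rng.choices("eta", weights=[6, 3, 3], k=3000))

biased = walk_endpoint(text, make_codes("11111"))    # each 'e' moves the walk by +5
balanced = walk_endpoint(text, make_codes("10101"))  # each 'e' moves the walk by +1

# The two walks differ by exactly 4 steps per occurrence of 'e'.
print(biased - balanced, 4 * text.count("e"))
```

The same text therefore yields very different walks depending on an arbitrary coding
choice, which is the concern raised above.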

They show also long correlations in a human-written sequence of 
"random" numbers.

Their conclusions are fairly inconclusive.

Cheers,

Gabriel


From jim@mail.rand.org  Tue Jan 25 04:39:18 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id EAA95077
	for <reeds@fry.research.att.com>; Tue, 25 Jan 2000 04:39:18 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id 3E5E01E004; Tue, 25 Jan 2000 04:39:18 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail02-lax.pilot.net (mail-lax-2.pilot.net [205.139.40.16])
	by mail-green.research.att.com (Postfix) with ESMTP id ACB551E002
	for <reeds@research.att.com>; Tue, 25 Jan 2000 04:39:17 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail02-lax.pilot.net with ESMTP id BAA19799; Tue, 25 Jan 2000 01:39:14 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id BAA04924; Tue, 25 Jan 2000 01:39:13 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id BAA24304 for <voynich@rand.org>; Tue, 25 Jan 2000 01:38:58 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id BAA04904 for <voynich@rand.org>; Tue, 25 Jan 2000 01:38:56 -0800 (PST)
Received: from mail.alphalink.com.au (mail.alphalink.com.au [203.24.205.7]) by mail03-lax.pilot.net with ESMTP id BAA24143 for <voynich@rand.org>; Tue, 25 Jan 2000 01:38:54 -0800 (PST)
Received: from LOCALNAME (d01-as15-mel.alphalink.com.au [202.161.98.64])
	by mail.alphalink.com.au (8.9.3/8.9.3) with SMTP id UAA02262
	for <voynich@rand.org>; Tue, 25 Jan 2000 20:38:37 +1100
Message-ID: <388DDF59.5636@alphalink.com.au>
Date: Tue, 25 Jan 2000 09:37:29 -0800
From: Jacques Guy <jguy@alphalink.com.au>
Reply-To: jguy@alphalink.com.au
X-Mailer: Mozilla 3.01 (Win16; I)
MIME-Version: 1.0
To: voynich@rand.org
Subject: Re: meaning-less/full
References: <200001221227.KAA02811@coruja.dcc.unicamp.br>
			 <12CPS7-0TZJ44C@fwd02.sul.t-online.de> <388B4068.C18A1B8E@nctimes.net> <12CrXU-0oye8mC@fwd01.sul.t-online.de> <388D605B.5653083A@nctimes.net>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR

My silence does not mean that I have not
been paying attention. I have been trying
to put together a... not definition... an
exposition perhaps, of language in terms 
independent from mathematics, yet amenable
to mathematical analysis (which I shall 
leave  to you). It is not easy. I need
more time. Back to my thinking cap now.

Frogguy.

From jim@mail.rand.org  Tue Jan 25 21:32:01 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id VAA78846
	for <reeds@fry.research.att.com>; Tue, 25 Jan 2000 21:32:01 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id 2472B1E00D; Tue, 25 Jan 2000 21:32:01 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail01-lax.pilot.net (mail-lax-1.pilot.net [205.139.40.18])
	by mail-green.research.att.com (Postfix) with ESMTP id A107D1E002
	for <reeds@research.att.com>; Tue, 25 Jan 2000 21:32:00 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail01-lax.pilot.net with ESMTP id SAA26436; Tue, 25 Jan 2000 18:31:57 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id SAA27812; Tue, 25 Jan 2000 18:31:56 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id SAA13232 for <voynich@rand.org>; Tue, 25 Jan 2000 18:31:26 -0800 (PST)
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id SAA27729 for <voynich@rand.org>; Tue, 25 Jan 2000 18:31:25 -0800 (PST)
Received: from mail.alphalink.com.au (mail.alphalink.com.au [203.24.205.7]) by mail02-lax.pilot.net with ESMTP id NAA26472 for <voynich@rand.org>; Tue, 25 Jan 2000 13:37:22 -0800 (PST)
Received: from LOCALNAME (d16-as1-mel.alphalink.com.au [202.161.100.16])
	by mail.alphalink.com.au (8.9.3/8.9.3) with SMTP id IAA23296
	for <voynich@rand.org>; Wed, 26 Jan 2000 08:37:12 +1100
Message-ID: <388E87C5.296A@alphalink.com.au>
Date: Tue, 25 Jan 2000 21:36:05 -0800
From: Jacques Guy <jguy@alphalink.com.au>
Reply-To: jguy@alphalink.com.au
X-Mailer: Mozilla 3.01 (Win16; I)
MIME-Version: 1.0
To: voynich@rand.org
Subject: Tibetan and tones
References: <200001241326.LAA04251@coruja.dcc.unicamp.br>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: O

>     | bsnams nas mnyan yod kyi grong khyer chen por bsod snyoms kyi
>     | phyir zhugs so, ,de nas bcom ldan 'das mnyan yod kyi grong khyer
>     | chen por bsod snyoms kyi phyir gshegs nas bsod


One, it's not mind-boggling. Russian does it occasionally (Mtsevsk,
vskhlipnut'), southern Sakao systematically (tmhert, tmkkleprn),
and later Etruscan lost all post-tonic vowels, so that a
typical Etruscan word pattern was CVCC....

Two, I don't buy that "tones out of consonants" theory. Tones
are the first thing mastered by babies learning their 
language. They are the easiest thing to hear and to 
articulate. I have argued elsewhere, on the strength of tones,
against the stupid claim that the Neanderthals did not have
language because they could not articulate our consonants and
vowels very well. Just two consonants (e.g. a cough and a retch 
[hey Mark ;-]), two vowels, eight tones, a (C)V syllable
pattern... count them: 3x2x8 = 48 possible syllables. 
Rotokas, which has 5 vowels and 6 consonants, but is without
tones, manages only 35!
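
The arithmetic above can be checked with a short sketch. The function name and the
placeholder phoneme labels are hypothetical; only the inventory sizes come from the
text. A (C)V syllable has an optional consonant, so the number of onsets is the number
of consonants plus one.

```python
from itertools import product


def count_syllables(consonants, vowels, tones=1):
    """Count (C)V syllables: optional consonant, one vowel, one tone."""
    onsets = [""] + list(consonants)  # "" stands for "no consonant"
    return sum(1 for _ in product(onsets, vowels, range(tones)))


# Two consonants, two vowels, eight tones: 3 x 2 x 8
toy = count_syllables(["C1", "C2"], ["V1", "V2"], tones=8)

# Six consonants, five vowels, no tonal contrast: 7 x 5
rotokas_like = count_syllables(
    [f"C{i}" for i in range(1, 7)],
    [f"V{i}" for i in range(1, 6)],
)

print(toy, rotokas_like)  # 48 35
```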

Three, granting the tones-out-of-consonants theory, how
come Mandarin has fewer tones than Cantonese, and also
has retained *fewer* consonants? How come Shanghaiese,
which has lost the most consonants, has only three 
tones? How come Shanghai students once developed a slang with
NO tones? (If I remember correctly, that was in the 1930's)

Frogguy

From jim@mail.rand.org  Wed Jan 26 01:15:47 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-green.research.att.com (mail-green.research.att.com [135.207.30.103])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id BAA94325
	for <reeds@fry.research.att.com>; Wed, 26 Jan 2000 01:15:47 -0500 (EST)
Received: by mail-green.research.att.com (Postfix)
	id 85B7F1E00D; Wed, 26 Jan 2000 01:15:47 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail03-lax.pilot.net (mail-lax-3.pilot.net [205.139.40.17])
	by mail-green.research.att.com (Postfix) with ESMTP id 0C2611E008
	for <reeds@research.att.com>; Wed, 26 Jan 2000 01:15:47 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail03-lax.pilot.net with ESMTP id WAA00625; Tue, 25 Jan 2000 22:15:43 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id WAA04845; Tue, 25 Jan 2000 22:15:42 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id WAA20912 for <voynich@rand.org>; Tue, 25 Jan 2000 22:15:33 -0800 (PST)
Received: from mail02-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id WAA04832 for <voynich@rand.org>; Tue, 25 Jan 2000 22:15:33 -0800 (PST)
Received: from relay01.chello.nl (smtp.chello.nl [212.83.68.144]) by mail02-lax.pilot.net with ESMTP id WAA00573 for <voynich@rand.org>; Tue, 25 Jan 2000 22:15:31 -0800 (PST)
Received: from node14b9d.a2000.nl ([24.132.75.157]) by relay01.chello.nl
          (InterMail vK.4.02.00.00 201-232-116 license a4501b83b68dc3e36f6046e1d8586abe)
          with SMTP id <20000126062241.UHH3293.relay01@node14b9d.a2000.nl>
          for <voynich@rand.org>; Wed, 26 Jan 2000 07:22:41 +0100
From: Miguel Carrasquer Vidal <mcv@wxs.nl>
To: voynich@rand.org
Subject: Re: Tibetan and tones
Date: Wed, 26 Jan 2000 07:14:57 +0100
Reply-To: mcv@wxs.nl
Message-ID: <gn3t8so4gnej395f9hvbqv5os6jjo6l241@4ax.com>
References: <200001241326.LAA04251@coruja.dcc.unicamp.br> <388E87C5.296A@alphalink.com.au>
In-Reply-To: <388E87C5.296A@alphalink.com.au>
X-Mailer: Forte Agent 1.7/32.534
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: jim@mail.rand.org
Status: OR

Jacques Guy <jguy@alphalink.com.au> wrote:

>Three, granting the tones-out-of-consonants theory, how
>come Mandarin has fewer tones than Cantonese, and also
>has retained *fewer* consonants? How come Shanghaiese,
>which has lost the most consonants, has only three 
>tones? How come Shanghai students once developed a slang with
>NO tones? (If I remember correctly, that was in the 1930's)

Indeed.  The development of tones seems to have more to do with
the loss of vowels (or rather, syllables) than with the loss of
consonants, and even that's not a requirement.  However,
consonants (whether or not lost) do play a role in the shape that
the tones take.  If I remember correctly, unvoiced/aspirated
tends to be associated with falling tone, and voiced/glottalized
with rising tone (or was it the other way around?).

=======================
Miguel Carrasquer Vidal
mcv@wxs.nl

From jim@mail.rand.org  Mon Jan 31 19:17:10 2000
Return-Path: <jim@mail.rand.org>
Received: from mail-blue.research.att.com (mail-blue.research.att.com [135.207.30.102])
	by fry.research.att.com (980427.SGI.8.8.8/8.8.7) with ESMTP id TAA69928
	for <reeds@fry.research.att.com>; Mon, 31 Jan 2000 19:17:09 -0500 (EST)
Received: by mail-blue.research.att.com (Postfix)
	id 6ED614CE26; Mon, 31 Jan 2000 19:17:09 -0500 (EST)
Delivered-To: reeds@research.att.com
Received: from mail02-lax.pilot.net (mail-lax-2.pilot.net [205.139.40.16])
	by mail-blue.research.att.com (Postfix) with ESMTP id AA8694CE22
	for <reeds@research.att.com>; Mon, 31 Jan 2000 19:17:08 -0500 (EST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail02-lax.pilot.net with ESMTP id QAA06231; Mon, 31 Jan 2000 16:17:05 -0800 (PST)
Received: from mail.rand.org (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id QAA18125; Mon, 31 Jan 2000 16:17:03 -0800 (PST)
Received: from mail01-rand.pilot.net (unknown-23-138.pilot.net [204.48.23.138]) by mail.rand.org (8.9.3/8.9.3) with ESMTP id QAA29429 for <voynich@rand.org>; Mon, 31 Jan 2000 16:14:55 -0800 (PST)
Received: from mail03-lax.pilot.net (localhost [127.0.0.1]) by mail01-rand.pilot.net (8.8.5/8.8.5) with ESMTP id QAA17905 for <voynich@rand.org>; Mon, 31 Jan 2000 16:14:54 -0800 (PST)
Received: from grande.dcc.unicamp.br (grande.dcc.unicamp.br [143.106.7.8]) by mail03-lax.pilot.net with ESMTP id QAA09409 for <voynich@rand.org>; Mon, 31 Jan 2000 16:14:47 -0800 (PST)
Received: from amazonas.dcc.unicamp.br (amazonas.dcc.unicamp.br [143.106.7.11])
	by grande.dcc.unicamp.br (8.9.3/8.9.3) with ESMTP id WAA02292
	for <voynich@rand.org>; Mon, 31 Jan 2000 22:14:05 -0200 (EDT)
Received: from coruja.dcc.unicamp.br (coruja.dcc.unicamp.br [143.106.24.80])
	by amazonas.dcc.unicamp.br (8.8.5/8.8.5) with ESMTP id WAA08113
	for <voynich@rand.org>; Mon, 31 Jan 2000 22:14:02 -0200 (EDT)
Received: (from stolfi@localhost)
	by coruja.dcc.unicamp.br (8.8.5/8.8.5) id WAA12095;
	Mon, 31 Jan 2000 22:14:02 -0200 (EDT)
Date: Mon, 31 Jan 2000 22:14:02 -0200 (EDT)
Message-Id: <200002010014.WAA12095@coruja.dcc.unicamp.br>
From: Jorge Stolfi <stolfi@dcc.unicamp.br>
To: voynich@rand.org
Subject: Prague visit
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=iso-8859-1
Reply-To: stolfi@dcc.unicamp.br
Sender: jim@mail.rand.org
Status: ORr


I am going to a conference in Plzen (Czech Republic) next week, and I
plan to be in Prague for a couple of days after the conference. I
would love to do some Voynich fact-hunting while I'm there.
Any suggestions are welcome.

(Of course my chances of success are quite small, given that I speak
neither Czech nor German, and it will be at least 40 degrees C below my
operating temperature.  But I will try...)

All the best,

--stolfi

From reeds Mon Jan 31 19:28:41 2000
From: reeds@fry.research.att.com (Jim Reeds)
Message-Id: <1000131192841.ZM6465145@fry.research.att.com>
Date: Mon, 31 Jan 2000 19:28:41 -0500
In-Reply-To: Jorge Stolfi <stolfi@dcc.unicamp.br>
        "Prague visit" (Jan 31, 22:14)
References: <200002010014.WAA12095@coruja.dcc.unicamp.br>
X-Mailer: Z-Mail (4.0.1 13Jan97)
To: stolfi@dcc.unicamp.br, voynich@rand.org
Subject: Re: Prague visit
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Status: OR

Be sure to check out the preserved dodo in the Strahov Monastery
library.

On Jan 31, 22:14, Jorge Stolfi wrote:
> I am going to a conference in Plzen (Czech Republic) next week, and I
> plan to be in Prague for a couple of days after the conference. I
> would love to do some Voynich fact-hunting while I'm there.
> Any suggestions are welcome.



-- 
Jim Reeds, AT&T Labs - Research
Shannon Laboratory, Room C229, Building 103
180 Park Avenue, Florham Park, NJ 07932-0971, USA

reeds@research.att.com, phone: +1 973 360 8414, fax: +1 973 360 8178

