Last edited on 1998-07-15 04:19:23 by stolfi
Reeds-compressed Voynichese
The samples below were derived from three sections of the Voynich
manuscript (Herbal-A, Herbal-B, and Biological) by a compression
process suggested by Jim Reeds: repeatedly find the most common
digraph and replace it with a new symbol. (Some scripts that
implement this transformation are available.)
The original alphabet was basic EVA (lower-case letters only).
For new symbols, we used the uppercase letters A, B, C, ... in order. The
comments in each file list the substitutions performed at each stage.
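The procedure described above can be sketched roughly as follows. This is an illustrative reconstruction, not the scripts actually used for these samples: the function names, the alphabetical tie-breaking rule, and the choice to count overlapping digraphs are all assumptions. The input is a list of text units (for example words), and digraphs are never counted or replaced across unit boundaries.

```python
from collections import Counter

def most_common_digraph(units):
    """Return the most frequent adjacent symbol pair, counted inside each unit only."""
    counts = Counter()
    for unit in units:
        for i in range(len(unit) - 1):
            counts[unit[i:i + 2]] += 1  # overlapping pairs are all counted
    if not counts:
        return None
    # Assumption: ties are broken alphabetically so the result is deterministic.
    return min(counts, key=lambda d: (-counts[d], d))

def reeds_compress(units, stages):
    """Run up to `stages` rounds of digraph replacement.

    Returns the compressed units and the list of (new_symbol, digraph)
    substitutions, in the order they were performed.
    """
    new_symbols = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # uppercase letters, in order
    substitutions = []
    for stage in range(min(stages, len(new_symbols))):
        digraph = most_common_digraph(units)
        if digraph is None:
            break  # nothing left to pair up
        symbol = new_symbols[stage]
        # str.replace scans left to right and replaces non-overlapping matches.
        units = [u.replace(digraph, symbol) for u in units]
        substitutions.append((symbol, digraph))
    return units, substitutions
```

For instance, on the made-up input `["qokedy", "qokeedy", "chedy", "shedy", "okedy"]` with two stages, this sketch first replaces `dy` by `A` and then `eA` by `B`, so a later symbol can stand for a trigraph of the original alphabet.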
Word-based compression
These texts were compressed on a word-by-word basis,
i.e. the substituted digraphs neither included nor
spanned word breaks.
- HEA / word ×10
The Herbal-A section, word compression, 10 stages
[ full ]
[ page ]
l = 1 r = 0
[ colorized page ]
[ bits per tuple ]
l = 2 r = 0
[ colorized page ]
[ bits per tuple ]
l = 3 r = 0
[ colorized page ]
[ bits per tuple ]
l = 1 r = 1
[ colorized page ]
[ bits per tuple ]
- HEA / word ×20
The Herbal-A section, word compression, 20 stages
[ full ]
[ page ]
l = 1 r = 0
[ colorized page ]
[ bits per tuple ]
l = 2 r = 0
[ colorized page ]
[ bits per tuple ]
l = 3 r = 0
[ colorized page ]
[ bits per tuple ]
l = 1 r = 1
[ colorized page ]
[ bits per tuple ]
- HEB / word ×10
The Herbal-B section, word compression, 10 stages
[ full ]
[ page ]
l = 1 r = 0
[ colorized page ]
[ bits per tuple ]
l = 2 r = 0
[ colorized page ]
[ bits per tuple ]
l = 3 r = 0
[ colorized page ]
[ bits per tuple ]
l = 1 r = 1
[ colorized page ]
[ bits per tuple ]
- HEB / word ×20
The Herbal-B section, word compression, 20 stages
[ full ]
[ page ]
l = 1 r = 0
[ colorized page ]
[ bits per tuple ]
l = 2 r = 0
[ colorized page ]
[ bits per tuple ]
l = 3 r = 0
[ colorized page ]
[ bits per tuple ]
l = 1 r = 1
[ colorized page ]
[ bits per tuple ]
- BIO / word ×10
The Biological section, word compression, 10 stages
[ full ]
[ page ]
l = 1 r = 0
[ colorized page ]
[ bits per tuple ]
l = 2 r = 0
[ colorized page ]
[ bits per tuple ]
l = 3 r = 0
[ colorized page ]
[ bits per tuple ]
l = 1 r = 1
[ colorized page ]
[ bits per tuple ]
- BIO / word ×20
The Biological section, word compression, 20 stages
[ full ]
[ page ]
l = 1 r = 0
[ colorized page ]
[ bits per tuple ]
l = 2 r = 0
[ colorized page ]
[ bits per tuple ]
l = 3 r = 0
[ colorized page ]
[ bits per tuple ]
l = 1 r = 1
[ colorized page ]
[ bits per tuple ]
Paragraph-based compression
These texts were obtained by compressing a whole paragraph at
a time. Word breaks were replaced by hyphens, and handled
like any other letter. Paragraph breaks, however,
were still treated as hard barriers.
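As a rough illustration of the paragraph-based setup, the sketch below prepares the text units and counts digraphs. It assumes paragraphs are separated by blank lines and words by whitespace, which may not match the actual transcription files; the function names are hypothetical.

```python
from collections import Counter

def paragraph_units(text):
    """Turn each paragraph into one string, with word breaks written as hyphens."""
    units = []
    for paragraph in text.split("\n\n"):  # assumption: blank line = paragraph break
        words = paragraph.split()
        if words:
            units.append("-".join(words))
    return units

def digraph_counts(units):
    """Count adjacent symbol pairs inside each unit; the hyphen is an ordinary symbol."""
    counts = Counter()
    for unit in units:
        for first, second in zip(unit, unit[1:]):
            counts[first + second] += 1
    return counts
```

With this preparation, a digraph such as `l-` or `-d` can be selected for replacement like any other, while no pair is ever formed from the last symbol of one paragraph and the first symbol of the next.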