Hacking at the Voynich manuscript - Side notes
034 The "key-like sequences" collected

INTRODUCTION

Here and there in the VMs one finds sequences of very short "words", each "word" having either a single EVA letter, or a group of EVA letters that tend to occur together. Those sequences are traditionally called "key-like sequences", although there is as yet no proof that they have anything to do with cryptography.

Those sequences are nonetheless valuable hints about the "true" Voynichese alphabet. The aim of this note is to collect them into a single document, in a convenient format.

LOCATIONS

The pages where key-like sequences have been noted are:

  f1r   alphabets (Latin, Voynichese, Latin) down right margin.
  f49v  digits and Voynichese letters down left margin.
  f57v  three rings of Voynichese letters, words, and weirdos.
  f66r  column of letters down middle of page.
  f69r  letters around central star.
  f75v  vertical `word' at upper right corner.
  f76r  vertical `title' down left margin.

SPECIAL SYMBOLS

The key-like sequences include several Voynichese glyphs that are otherwise rare ("weirdos") or look like isolated parts of EVA characters. For brevity we write them as &X where X is a single capital letter, according to this table:

  &X  EVA             description
  --  --------------  ------------------------------------------------------
  &C  ???             squarish <c>; <I> with bent ligature.
  &D  &140 (or &163)  right half of <o>.
  &G  &171            gallows with three loops.
  &H  &170            like <h> with top detached.
  &I  I               first half of <ih>.
  &J  ???             like <c> plus two stacked <o>s joined by a short bar.
  &K  ???             similar to <&G>, with an extra loop (or separate letter).
  &L  &169            like <Lo^> with <o> above the <L>.
  &R  &192            like <r> with the <i> almost horizontal, almost a <2>.
  &T  &195            upside down lambda with serif.
  &U  ???             rounded <v>, or top half of <o> or <y>.
  &Y  &172            upside down lambda without serif, or <r> with short plume.

PAGE f1r

Jim Reeds writes [15 Jul 94]: "The erased key on f1r is discussed by Brumbaugh.
It seems to have 3 vertical columns of letters. The leftmost is the ordinary alphabet, lower case italic hand, a through z. I could not check for the presence of every letter (I'm not sure about j, for instance) but a, b, c, ... o, p, q, r, s, ... y, z are pretty clear. Next to those are very spotty frags of Voynich letters. I could make out [EVA] <d> next to a, <r> next to c, <g> next to y, and one of the gallows letters somewhere near the q, r, s range. [...] The 3d column seems to be 1 off from the first: italic minuscules, r next to s, and so on. More is visible in UV shots than Petersen shows."

PAGE f49v

This page looks like an ordinary Herbal page, but has an extraordinary amount of text (26 lines), and a column of "letters" running down the left margin. Here they are:

   f  o  r  y  e &D  k  s
  -- -- -- -- -- -- -- --
  01 02 03 04 05 06 07 08

   p  o &R  y  e &D  s
  -- -- -- -- -- -- --
  09 10 11 12 13 14 15

   p  o &R  y  e &D
  -- -- -- -- -- --
  16 17 18 19 20 21

   d  y  e  k  y
  -- -- -- -- --
  22 23 24 25 26

The line breaks above are meant to highlight the near-periodicity of the first 21 lines. In the original page, each character is clearly aligned with one text line, with a single word space between them. The sequence follows the slightly tilted and bent outline of the text, and its letters are spaced like the text lines, including a wider gap between positions 13 and 14, matching an extra-wide paragraph break.

The alignment of the sequence letters with the text lines is puzzling because the line breaks do not seem to be significant: the text is formatted as paragraphs, and the right margin seems defined by the irregular outline of the plant. On the other hand the "period" of the sequence gets shorter as the lines get wider; so perhaps these symbols are counting words, or letters?

To the left of items 02--06 there are Western digits "1" through "5". It is not obvious whether they are in the same hand as the sequence and/or the main text.
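The shrinking "period" of the f49v sequence can be checked mechanically by locating the recurring run <y> <e> <&D> and measuring the distance between its successive occurrences. The following is only a minimal Python sketch, assuming the 26-item transcription given above; the names F49V and find_motif are invented for this illustration.

```python
# Marginal sequence of f49v as transcribed above (26 items, one per text line).
F49V = "f o r y e &D k s p o &R y e &D s p o &R y e &D d y e k y".split()


def find_motif(seq, motif):
    """Return the 1-based start positions at which `motif` occurs in `seq`."""
    n = len(motif)
    return [i + 1 for i in range(len(seq) - n + 1) if seq[i:i + n] == motif]


# Positions of the recurring run <y> <e> <&D>, numbered as in the table above.
starts = find_motif(F49V, ["y", "e", "&D"])

# Distances between successive occurrences, i.e. the local "period".
gaps = [b - a for a, b in zip(starts, starts[1:])]

print(starts)  # [4, 12, 19]
print(gaps)    # [8, 7]
```

The run starts at items 04, 12, and 19, so the distance drops from 8 to 7, in line with the row lengths 8, 7, 6 chosen for the table. The last five items (22--26) contain <y> <e> but not the full run, which is why they are treated separately above.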
The unusual characters <&R> and <&D> and the deformed shape of some letters like <k> and <y> suggest that both the "key-like sequence" and the digits may be later additions. Perhaps the reason why the digit sequence starts on line 2 is that the scribbler thought that <r> was the digit "2", and tried to line them up. That may also be the reason why he used <&R> instead of <r> further down the sequence.

PAGE f57v

Page f57v contains a B-language paragraph and a circular diagram with four rings of text, surrounding four figures and some phrases at odd angles.

The third ring of text, from the center out (f57v:3), contains a list of 17 isolated "letters", repeated 4 times with slight variations. An extra-wide gap at 10:30 and an isolated word at 11:00 outside the diagram strongly suggest the sequence starts there.

   o  l  j  r  v  x  k  m  f &L  t  r &H &G  y &I &Y
   o  l  d  r  v  x  k  m  f &L  t  r &H &G  y  c &Y
   o  l  d  r  v  x  k  m  p &L  t  r &H &G  y  c &Y
   o  l  d  r  v  x  k  m  p &L  t  r &H &G  y  c &Y
  -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
  01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17
  18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
  35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51
  52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68

As Dennis Mardle observed, these variations suggest that the pairs <j> and <d>, <f> and <p>, and <&I> and <c> are equivalent (or at least closely related). On the other hand, the consistent occurrence of <k> and <t> in their respective places argues for them being distinct.

Note that character 16 and its periodic counterparts (33, 50, 67) have a well-developed ligature to the right, so the reading <&I>/<c> seems more correct than <i>/<e>.
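The positions where the four repetitions of the f57v:3 sequence disagree can be picked out mechanically by comparing them column by column. This is a minimal Python sketch, assuming the transcription of the third ring given above; RING3 and varying_columns are hypothetical names used only here.

```python
# The four repetitions of the 17-letter sequence of f57v:3, as transcribed above.
RING3 = [
    "o l j r v x k m f &L t r &H &G y &I &Y",
    "o l d r v x k m f &L t r &H &G y c &Y",
    "o l d r v x k m p &L t r &H &G y c &Y",
    "o l d r v x k m p &L t r &H &G y c &Y",
]
rows = [line.split() for line in RING3]


def varying_columns(rows):
    """Map each 1-based column that is not constant to its sorted distinct symbols."""
    result = {}
    for col, symbols in enumerate(zip(*rows), start=1):
        distinct = sorted(set(symbols))
        if len(distinct) > 1:
            result[col] = distinct
    return result


print(varying_columns(rows))
# {3: ['d', 'j'], 9: ['f', 'p'], 16: ['&I', 'c']}
```

Only columns 3, 9, and 16 vary, giving exactly the pairs <j>/<d>, <f>/<p>, and <&I>/<c> discussed above. Columns 7 and 11 hold <k> and <t> in every repetition.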
The second-innermost ring of text (f57v:2) has a few "letters" interspersed among normal words:

  daiin otey ofchey shes o d okchod l o
                         - -        - -

  lkeeol dkedar o f aros r y
                - -      - -

  chedaiin k chety x docodal v,o tchor sh
           -       -         - -       --

  tedar dal &T daiin aiin otal daro v
            --                      -

The innermost ring (f57v:1) contains mostly single-letter words with a few ordinary-looking words. Some spaces are questionable, but a case can be made for 34 items in all:

  x       r       l       o       v       l
  ------- ------- ------- ------- ------- -------
  01      02      03      04      05      06

  r       m       aiin    d       &H      &C
  ------- ------- ------- ------- ------- -------
  07      08      09      10      11      12

  f       r       y       l       k       x
  ------- ------- ------- ------- ------- -------
  13      14      15      16      17      18

  l       r       &K,ar   o       r       &U
  ------- ------- ------- ------- ------- -------
  19      20      21      22      23      24

  t       l       s       d       y       d,ar
  ------- ------- ------- ------- ------- -------
  25      26      27      28      29      30

  teodar  otadal  sheky   otchody
  ------- ------- ------- -------
  31      32      33      34

PAGE f66r

Page f66r is on the same face of the same bifolio as f57v. It contains a column with 15 "labels" (normal-length words), another column with 34 "letters", and a third column with 32 lines of "B"-language text grouped into several paragraphs. Here are the letters in the middle column:

    y o     s sh    y d     o f     * x   air d
  ---+--- ---+--- ---+--- ---+--- ---+--- ---+---
   01 02   03 04   05 06   07 08   09 10   11 12

   sh y     f f     y o     d r     f c    &T x
  ---+--- ---+--- ---+--- ---+--- ---+--- ---+---
   13 14   15 16   17 18   19 20   21 22   23 24

    t o    &T l     r t     o x     p d
  ---+--- ---+--- ---+--- ---+--- ---+---
   25 26   27 28   29 30   31 32   33 34

[ Can anyone provide a reading for #09? ]

The <sh> in position 13 looks more like <&I'h>.

The two <f>s in positions 15 and 16 are different: the horizontal stroke that crosses the leg is straight in #15, but ends with a <c>-like hook in #16. The corresponding words on the left column also show this difference. In that case, the difference may be explained by crowding -- the hook can be drawn only when the <f> is word-initial.
But this explanation does not apply to the isolated letters. Hmmm...

Note again that entry 22 is definitely a <c> and not an <e>.

The three columns begin and end more or less in sync (one label for every two letters and two lines of text) but get out of alignment around the middle. Note that 34 = 2 × 17. One possibility is that the scribe lost two words (and with them the alignment), and that column 1 originally had 17 words, one for each of the 17 pairs of "letters" in column 2.

Should the middle column be read as 17 pairs, as suggested by the initial alignments with the left column? Some of the pairs do seem to have some logic, especially if we note that page f66r lies at the boundary between languages A and B:

  pair 1: <y> and <o> - occur in similar contexts and may be equivalent.
  pair 2: <s> and <sh> - have similar shapes and distributions.
  pair 3: <y> and <d> - are dissimilar, but <dy> is a characteristic ending in language B.
  pair 4: <o> and <f> - the <o> often occurs before <f>.
  pair 5: <*> and <x> - ?
  pair 6: <aiin> and <d> - <daiin> is the most common word of language A.
  pair 7: <sh> and <y> - <shy> is also a common combination.
  pair 8: <f> and <f> - contrasting straight and hooked arm?
  pair 9: <y> and <o> - again! see above.
  pair 10: <d> and <s> - somewhat similar shapes and distributions.
  pair 11: <f> and <c> - the <c> often precedes <f>.
  pair 12: <&T> and <x> - two weirdos.
  pair 13: <t> and <o> - often together.
  pair 14: <&T> and <l> - ?
  pair 15: <r> and <t> - ?
  pair 16: <o> and <x> - ?
  pair 17: <p> and <d> - the former may substitute for the latter?

My guess is that the topic of this page is the Voynichese language itself, perhaps an explanation of the differences between "languages" A and B (or a change in the spelling system).

PAGE f69r

This "cosmological" page displays a circular diagram, at whose center is a disk divided into six unequal parts by the rays of a star. Each sector is labeled with one(?) Voynichese letter.
Clockwise from 01:45 (the presumed "start" of the surrounding diagram) the letters are:

   l  s &J  y  d  o
  -- -- -- -- -- --
  01 02 03 04 05 06

The <&J> has some topological resemblance to EVA <cd>, but the shape and angles do not seem to match.

PAGE f75v

On the upper right corner of this page there is a key-like sequence with five letters, all but the last one aligned with the text lines. Here they are:

   s  l  l  o  r
  -- -- -- -- --
  01 02 03 04 05

The whole sequence lies above the final letter (<r> or <s>) of the 5th text line. It can be argued that the sequence includes that letter too.

PAGE f76r

This page has no drawings, only four paragraphs of text. The first paragraph takes half a page (29 lines), and has a column of "letters" along its left margin. The letters are irregularly spaced along the column, but seem to be aligned with the main text lines. Here they are:

   s  d  q  s  o  l  k  r  s
  -- -- -- -- -- -- -- -- --
  01 02 03 04 05 06 07 08 09

CONCLUSIONS

The key-like sequences should give us clues about the nature of the Voynichese alphabet, especially about the grouping of strokes into characters and the equivalence of character shapes. There are two caveats we should keep in mind, however.

First, some sequences may not be original, but scribblings by some later owner (or library patron) who tried to decipher the manuscript. This warning is particularly apt for those sequences that have parallel sequences of western letters or digits, such as the sequence on f1r and the vertical sequence on f49v. The latter is dubious also because it uses characters that are not found elsewhere, not even on other key sequences, and its symbols are somewhat ill-proportioned. On the other hand, it is somewhat unlikely that someone would scribble a "key" on a potentially valuable book, before he knew whether the key worked or not. And the warning hardly applies to the sequences on f57v, f66r, and f69r, which are obviously part of the original layout.
Second, the symbols used in the sequences may not all be letters. In modern texts one finds plenty of non-letter symbols, often mixed with letters, in certain special contexts: digits, punctuation, footnote daggers and stars, item bullets, paragraph and section signs, copyright marks, mathematical symbols, etc. Obviously the key-like sequences above are all "special contexts" where such symbols are expected to occur. In particular, with regard to the 4×17 sequence of f57v, it is hard to believe that the <&L> ("Lo^") symbol (which occurs nowhere else in the text) is a letter of the alphabet.

Keeping these caveats in mind, what can we learn from the key-like sequences?

First, the key sequences suggest that the following symbols are complete single letters:

         confirming sequences
         -----------------------------------------------
  symb    49v  57v:3 57v:2 57v:1  66r   69r   75v   76r
  ------ ----- ----- ----- ----- ----- ----- ----- -----
  <o>     XXX   XXX   XXX   XXX   XXX   XXX   XXX   XXX
  <y>     XXX   XXX   XXX   XXX   XXX   XXX    -     -
  ------ ----- ----- ----- ----- ----- ----- ----- -----
  <r>     XXX   XXX   XXX   XXX   XXX    -    XXX   XXX
  <s>      -     -     -    XXX   XXX   XXX   XXX   XXX
  <l>      -    XXX   XXX   XXX   XXX    -    XXX   XXX
  ------ ----- ----- ----- ----- ----- ----- ----- -----
  <d>     XXX    -    XXX   XXX   XXX   XXX    -    XXX
  <m>      -     -     -    XXX    -     -     -     -
  ------ ----- ----- ----- ----- ----- ----- ----- -----
  <c>      -    XXX    -     -    XXX    -     -     -
  <&I>     -    XXX    -     -     -     -     -     -
  <e>     XXX    -     -     -     -     -     -     -
  ------ ----- ----- ----- ----- ----- ----- ----- -----
  <f>     XXX   XXX   XXX   XXX   XXX    -     -     -
  <p>     XXX   XXX    -     -    XXX    -     -     -
  <k>     XXX   XXX   XXX   XXX    -     -     -    XXX
  <t>      -    XXX    -    XXX   XXX    -     -     -
  ------ ----- ----- ----- ----- ----- ----- ----- -----
  <sh>     -     -    XXX    -    XXX    -     -     -
  ------ ----- ----- ----- ----- ----- ----- ----- -----
  <q>      -     -     -     -     -     -     -    XXX
  ------ ----- ----- ----- ----- ----- ----- ----- -----
  <v>      -    XXX   XXX   XXX    -     -     -     -
  <x>      -    XXX   XXX   XXX   XXX    -     -     -
  ------ ----- ----- ----- ----- ----- ----- ----- -----
  <aiin>   -     -     -    XXX    -     -     -     -
  <air>    -     -     -     -    XXX    -     -     -
  ------ ----- ----- ----- ----- ----- ----- ----- -----
  <&C>     -     -     -    XXX    -     -     -     -
  <&G>     -    XXX    -     -     -     -     -     -
  <&H>     -    XXX    -    XXX    -     -     -     -
  <&J>     -     -     -     -     -    XXX    -     -
  <&K>     -     -     -    XXX    -     -     -     -
  <&L>     -    XXX    -     -     -     -     -     -
  <&T>     -     -     -     -    XXX    -     -     -
  <&Y>     -    XXX   XXX    -     -     -     -     -
  <&U>     -     -     -    XXX    -     -     -     -
  ------ ----- ----- ----- ----- ----- ----- ----- -----

The following EVA letters and proposed letters are conspicuously absent from the sequences:

  <a> <b> <g> <h> <i> <n> <u> <z>
  <ch> <ee> <eee>
  <cth> <ckh> <cph> <cfh>
  <ith> <ikh> <iph> <ifh>
  <ct> <ck> <cp> <cf>

Note in particular that <aiin> and <air> are used as single letters. On the other hand the sequences do not give any direct evidence for the distinction between <aiin> and its variants <ain>, <aiiin>, <oiin>, etc.

The absence of <a> in these sequences supports the claim that <a> is simply a calligraphic variant of <o> or <y>. On the other hand the sequences strongly suggest that <o> and <y> are distinct letters.

Note that <e> too is absent, except for the highly suspect sequence f49v; this sequence also includes <&D> (the right half of <o>), not used anywhere else. This suggests that <e> is not an independent letter, but rather a modifier of a nearby letter, as suggested by the OKOKOKO analysis.

The letter <c> is distinct from <e>. Its occurrences on f57v:3 (the 4×17 sequence) and f66r show a well-marked ligature, which would be superfluous if <c> were just a calligraphic variant of <e>.

The letters <&I> and <c> are probably equivalent. This is directly supported by their occurrence in equivalent spots of the 4×17 sequence, and indirectly by the fact that other sequences that include <c> do not include <&I>.

The absence of the `platform gallows' (<cth> and variations) confirms the longstanding suspicion that they are contractions of the plain gallows with other characters, possibly <ch>.

It seems that <m> is distinct from <d> (and <j>), as attested by f57v:3. On the other hand, f57v:3 suggests that <d> and <j> are the same letter.
The distinction between <r> and <s> is supported by their clearly drawn samples in the sequences of f66r, f75v, and f76r. Sequence f49v, for whatever it is worth, suggests that <r> and <&R> are the same.

The letters <f> and <p> may be equivalent. This is suggested by their occurrence in corresponding spots of the f49v and f57v:3 sequences. However, their separate occurrences on page f66r weakly suggest that they are distinct. On the other hand, f49v and f57v:3 suggest that <f> and <p> are distinct from <t> and <k>.

Last edited on 1999-02-01 04:52:00 by stolfi