# Last edited on 2026-03-03 11:09:13 by stolfi

001 Basic statistics of the biology section
002 Transforming Chinese into Voynichese.
003 Attempt to turn Arabic into Voynichese 
004 Effect of position-dependent ciphers on the word distribution.
005 
006 
007 
008 A prefix-midfix-suffix factorization of the bio section in EVA encoding.
009 A prefix-midfix-suffix factorization of the A and B herbal pages in EVA encoding.
010 Word distribution maps.
011 Trying to identify possible plant names in the herbal pages.
012 
013 Checking how hard it is to get Stojko-style "decipherments".
014 
015 Colorized version of the whole text.
016 Attempts to unscramble the anagrams on page f116v.
017 OKOKOKO: The fine structure of Voynichese words.
018 More on OKOKOKO: Are the "O"s modifiers for the "K"s?
019 
020 
021 Plotting word frequencies per page
022 
023 Plotting QOKOKOKO element frequencies per page
024 
025 Classifying OKOKOKO elements as word-initial, -medial, and -final
026 Colorizing text by entropy
027 Applying Jim Reeds's iterative digraph compression
028 Page-specific words
029 John Stojko's solution
030 Combining the p-m-s and OKOKOKO paradigms
031 Trying to build a Herbal-A to Herbal-B dictionary
032 Language portraits
033 Stylistic analysis of drawings
034 The "key-like sequences" collected
035 The pancake plants (f39r and f95r2) - comparing the texts
036 Theodore Petersen's plant index
037 A concordance of the VMS
038 Measuring the fuzziness of Voynichese
039 Classifying the VMS illustrations
040 Verbal descriptions of VMS pages
041 Cleaning up the scanned pages
042 Label occurrences in the text
043 Formatting the Voynich e-mail archives
044 
045 
046 
047 
048 
049 A checklist for Beinecke pilgrims
050 
051 Detailed label occurrence map in the Biological section
052 Reordering the biological pages by shared key words
053 Reordering the pages by word and element frequencies
054 Word length distribution, revisited
055 Occurrences of the OL/OR words
056 OKOKOKO word component statistics
057 Statistics of crust-mantle-core components
058 Probabilistic model for Voynichese words
059 Spatial distribution of gallows and tables
060 Complete and strict label concordance
061 Comparing the Recipes section to the Shennong Bencao
062 Occurrences of Grove words
063 
064 Looking for word quadruples with skew frequencies
065 Ink separation experiments
066 Looking for maximal repeated seqs in various langs
067 Looking for words with lumpy distribution
068 Language transducers with specified output freqs
069 
070 
071 
072 
073 VMS page descriptions for the 25e1 release
074 Revising the "U" (Stolfi) transcription
075 Visual analysis of retracing
076 Getting a clean transcription of the Recipes section
077 Further comparisons of Recipes to the Shennong Bencao 
078 Analyzing the rusty stains on f103v
079 Quire descriptions for the 25e1 release
080 Analysis of the multispectral images
081 Recreating the Scribe's misalignment of f34r 
082 Character and digraph frequencies in formulaic text
083 Generating pseudo-English by Thorsten and Timm
084 Investigating the line-initial words and glyphs of f83r
085 Generating texts with same language diff spelling
086 Enhancing text and drawings covered by green paint
087 Investigating offset printing between quires
088 On the nature of the VMS text ink
089 Investigating repeats across line breaks
090 Generating pseudo-Voynichese with high order Markov
091 Checking matches between Herbal and Pharma drawings
092 Generating fractional word count and frequency lists per section
093 Parsing words into elements and counting them
094 Visualizing the evolution of pharma jars
095 Investigating the spacing of {q} elements in parags
096 Investigating similarities between languages A and B

# Notes for the statistical analysis technical report

100 Preparing a clean Voynichese sample for analysis 
101 Preparing clean samples of various other languages 
102 Tabulating the most popular words and labels
103 Statistics of basic glyphs, strokes, and pairs thereof
104 Listing duplicate words in Voynichese and various languages
105 Extracting images of weirdos and unreadable glyphs
107 Computing and comparing word and token length distributions
108 Computing and comparing entropy profiles
109 Computing the token entropy
110 Comparative word rank-frequency (Zipf law) plots
111 Analyzing the length distribution of Vietnamese word elems
112 Analyzing occurrence and context of one-leg gallows

# Notes for the word structure technical report

201 Statistics of crescent and circle sequences
202 Statistics of OKOKOKO elements
203 Statistics of classical word paradigms

# Obsolete entries

502 Studying the Voynich "words" with finite automata.
503 Automaton-based analysis with reduced alphabet.
504 Attempt at factoring the 'edy' and 'air' word classes.
505 A complete factorization of words in ERA alphabet.
506 An alternative complete factorization of words in ERA alphabet.
507 A complete factorization of words in the original EVA alphabet.
519 Creating per-page and per-section "best pick" text files.
520 Gordon Rugg's hoax hypothesis

600 Converting the Interlinear to EVA.
614 Merging John Grove's list of labels into the interlinear file. 
619 Analyzing word frequencies per section
622 Analyzing QOKOKOKO element frequencies per section
624 Creating a new interim release of Landini's interlinear in EVA
644 Adding line numbers to Takahashi's transcription
645 Computing consensus/majority editions of the EVA interlinear
646 Producing an interlinear for the EVMT team with EVMT loci
647 Converting the interlinear to EVMT-like loci
648 Mapping Stolfi locators to the original Landini locators
650 Public text colorization service
669 Preparing the interlinear 16e7 for merging with the EVMT file
670 Preparing the EVMT files for merging with the 16e7 interlinear
671 Preparing a table that maps EVMT locators to 16e7 locators
672 Merging the EVMT file with release 16e7 for release 20e1