# Last edited on 2025-07-01 07:38:47 by stolfi
# Intro and working notes for text25e1-01.xev
#
# TO DO
#
# Text cleanup:
#
# ??? Convert to IVTFF notation.
# ??? Insert ()s around ALL implicitly ligated groups: (ch), (cth), etc.
#
# ??? Check &-codes in "H" version against Tak's original full-2-lig.evt.
# ??? Check ()-ligatures in "H" version against capitalization of full-2-lig.evt.
# ??? Consider using C1,C2 for the [|]-alternatives of C, etc.
# ??? Consolidate transcriptions [EIM] and [ZYS] into a single series.
# ??? Make sure that uncontroversial layout markers are uniform across all versions.
# ??? Replace <%%%...%%> by <***...**> when transcriber found it unreadable.
# ??? Restore "'" for plumes instead of "~" (in <>-lines only!).
# ??? Uniformize parag markers <=> across versions.
#
# ??? Check all weirdos in old transcriptions for recent EVA font code changes:
#
#   &{173} = old code of &{240}
#   &{141} = old code of &{241}
#   &{142} = old code of &{242}
#   &{143} = old code of &{243}
#   &{144} = old code of &{244}
#   &{157} = old code of &{245}
#   &{158} = old code of &{246}
#
# Alignment and consensus versions:
#
# ??? Find an encoding for ligatures suitable for majority/consensus voting.
#
# Locus partition and numbering:
#
# ??? Decide whether to split through as two columns.
# ??? Decide whether to split as two labels.
# ??? Decide whether to split into 10 nymph labels.
# ??? Decide whether to split as 11 labels.
# ??? Decide whether to split off as a title.
# ??? In any case, split at gap.
# ??? Make and other big red doodles into separate units.
# ??? Renumber lines of from 1 instead of 0?
# ??? Renumber pages to match the EVMT numbering?
# ??? Revise the order of and .
# ??? Split off the final of as .
#
# Invariants to check:
#
# ??? All lines end with proper delim [-=.,] (plus optional {}-comment).
# ??? Check there is header before each locus.
# ??? Check there is a header before each unit.
# ??? Letters, spaces, and [()] are never mixed in the same column.
# ??? Lines ending with "." are text rings.
# ??? No nested or mismatched [()] in non-# lines.
# ??? No nested or mismatched [{}] in non-# lines.
# ??? No two consecutive spaces [-=.,] after deleting all [!%]s (but not [*]s).
#
# Comment cleanup:
#
# ??? Check all comments for final punctuation.
# ??? Clean up the #-comments taken from the EVMT files (marked with "¤").
# ??? Create <>-locators for the figures too, so that #-comments can be attached to them.
# ??? Make sure that the 1st #-line after each <>{}-header is blank or a title.
# ??? Normalize nomenclature "text ring", "text circle", "text band", etc.
# ??? Normalize spacing and place of #-comments to show scope (page, unit, or line).
# ??? Provide Petersen's numbers for the lines in .
# ??? Put "*Title:" in front of 1st line after each <>{}-header.
# ??? Recover Grove's numbers for all labels in f101v1 (right half of ).
# ??? Remove from the unit and locus titles any info dup'ed from title of parent.
# ??? Turn the unit and entry <>{}-headers into ##-comments for the sake of vtt?
# ??? Uniformize "BLI04" for "2004 Beinecke scans" (or "images").
# ??? Zodiac pages: Add the o'clock position before each nymph label.
# ??? Zodiac pages: Check the o'clock positions of labels/nymphs from onwards.
# ??? Zodiac pages: give the o'clock positions of label, nymph and star separately.
# ??? Zodiac pages: restore Petersen's "o'clock" identifications for nymph labels.
# ??? Zodiac pages: standardize label position to be the start (middle?) of label.
#
# Documentation
#
# ??? Describe the file format and special conventions, e.g. <%>, line ends, etc..
# ??? Keep only one copy of each description item, in text20e1-51.evt or desc25e1-51.txt
# ??? Make sure all pages have Rene's page number (or not).
# ??? Observe that one or more <*>'s (like <%>) means *any number of* unreadable glyphs.
# ??? Tabulate and describe the major layout markers <->, {fg}, {|+|}, etc..
#
# NOT TO DO
#
# ??? Add H units for all non-Voynichese text.
# ??? Consider splitting two-word labels of into separate loci.
# ??? Consider using <-{=}> instead of <=> for parag breaks.
# ??? Give conjectured Roman letters rather than graphic description for non-V text.
# ??? In circular ring, put a {$} to mark the EVMT starting place?
# ??? In herbal, split the P unit whenever there is a large gap between parags.
# ??? Invent a notation for "possible parag break", e.g. <@>, <={?}>, <-{=?}>, etc.
# ??? Nymph labels in biological section: either one unit per page, or one per figure.
# ??? Renumber labels in as 1a,1b,..10a,10b instead of 1-20.
# ??? Renumber outer labels in to start at 10:00.
# ??? Split off as a separate unit?
# ??? Split unit into separate units.
# ??? Turn the comments "# There may be a parag..." into "possible parag break" marks.
# ??? Use " {R:...}" for non-Voynichese text, rather than #-comments.
#
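# The weirdo code changes tabulated under "Text cleanup" could be applied
# mechanically. What follows is only a minimal sketch, not the procedure
# actually used for this file. It assumes that the codes occur in the
# transcription text literally as &{NNN}, as written in that table, and
# that #-comment lines should be left untouched. The names OLD_TO_NEW and
# remap_weirdos are hypothetical.

```python
import re

# Old-to-new weirdo codes, copied from the table in the notes.
OLD_TO_NEW = {
    173: 240,
    141: 241,
    142: 242,
    143: 243,
    144: 244,
    157: 245,
    158: 246,
}

# Assumption: a weirdo code is written as "&{" + decimal digits + "}".
CODE_RE = re.compile(r"&\{(\d+)\}")


def remap_weirdos(line: str) -> str:
    """Return the line with every old &{NNN} code replaced by its new code.

    Lines starting with "#" are returned unchanged (assumption: comments
    may mention old codes on purpose, as the notes themselves do).
    Codes that are not in OLD_TO_NEW are kept as they are.
    """
    if line.startswith("#"):
        return line

    def substitute(match: re.Match) -> str:
        number = int(match.group(1))
        return "&{%d}" % OLD_TO_NEW.get(number, number)

    return CODE_RE.sub(substitute, line)
```

# Since the old and new code numbers do not overlap, applying the function
# twice to the same line gives the same result as applying it once.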