  Some interesting patterns are apparent above, but things may become clearer
  if we remove the garbage.

  Meanwhile, here is a count of the digraphs in the "good" words (counting
  repeated words):

                    o     c     t     i     q     l |     v     x     y     j     g     s   TOT
          ----- ----- ----- ----- ----- ----- -----   ----- ----- ----- ----- ----- ----- -----
              0  1363  2574     0   502  1874   107 |     0     0     0     0     0     0  6420
        o    34     4   112     0  1680   635  1430 |     0     0     0     0     0     0  3895
        c     7   172  3922  1447  2002   197   291 |     0     0  3764    12  2728  1445 15987
        t     7    96  2696     0    23    18    24 |     0     0     0     0     0     0  2864
        i     6     5    31     0  3395     3     3 |   943  2349     0    57     0   913  7705
        q     5  1622    35     0     0     2     4 |     0     0     0   969   215     0  2852
        l     0     0     0     0     0     0     0 |     0     0     0  2185    36     0  2221
          ----- ----- ----- ----- ----- ----- -----   ----- ----- ----- ----- ----- ----- -----
        v   912    10    18     0     3     0     0 |     0     0     0     0     0     0   943
        x  1085   159   757     0    22    50   276 |     0     0     0     0     0     0  2349
        y  3510     7    72     0    37    66    72 |     0     0     0     0     0     0  3764
        j    84   138  2658   319    23     0     1 |     0     0     0     0     0     0  3223
        g    78   123  2729    25    14     2     8 |     0     0     0     0     0     0  2979
        s   692   196   383  1073     4     5     5 |     0     0     0     0     0     0  2358
          ----- ----- ----- ----- ----- ----- -----   ----- ----- ----- ----- ----- ----- -----
      TOT  6420  3895 15987  2864  7705  2852  2221 |   943  2349  3764  3223  2979  2358 57560

      Next-symbol probabilities (× 99):

                    o     c     t     i     q     l |     v     x     y     j     g     s   TOT
          ----- ----- ----- ----- ----- ----- -----   ----- ----- ----- ----- ----- ----- -----
              .    21    40     .     8    29     2 |     .     .     .     .     .     .    99
        o     1     .     3     .    43    16    36 |     .     .     .     .     .     .    99
        c     .     1    24     9    12     1     2 |     .     .    23     .    17     9    99
        t     .     3    93     .     1     1     1 |     .     .     .     .     .     .    99
        i     .     .     .     .    44     .     . |    12    30     .     1     .    12    99
        q     .    56     1     .     .     .     . |     .     .     .    34     7     .    99
        l     .     .     .     .     .     .     . |     .     .     .    97     2     .    99
          ----- ----- ----- ----- ----- ----- -----   ----- ----- ----- ----- ----- ----- -----
        v    96     1     2     .     .     .     . |     .     .     .     .     .     .    99
        x    46     7    32     .     1     2    12 |     .     .     .     .     .     .    99
        y    92     .     2     .     1     2     2 |     .     .     .     .     .     .    99
        j     3     4    82    10     1     .     . |     .     .     .     .     .     .    99
        g     3     4    91     1     .     .     . |     .     .     .     .     .     .    99
        s    29     8    16    45     .     .     . |     .     .     .     .     .     .    99
          ----- ----- ----- ----- ----- ----- -----   ----- ----- ----- ----- ----- ----- -----
      TOT    11     7    27     5    13     5     4 |     2     4     6     6     5     4 57560

      Previous-symbol probabilities (× 99):

                    o     c     t     i     q     l |     v     x     y     j     g     s   TOT
          ----- ----- ----- ----- ----- ----- -----   ----- ----- ----- ----- ----- ----- -----
              .    35    16     .     6    65     5 |     .     .     .     .     .     .    11
        o     1     .     1     .    22    22    64 |     .     .     .     .     .     .     7
        c     .     4    24    50    26     7    13 |     .     .    99     .    91    61    27
        t     .     2    17     .     .     1     1 |     .     .     .     .     .     .     5
        i     .     .     .     .    44     .     . |    99    99     .     2     .    38    13
        q     .    41     .     .     .     .     . |     .     .     .    30     7     .     5
        l     .     .     .     .     .     .     . |     .     .     .    67     1     .     4
          ----- ----- ----- ----- ----- ----- -----   ----- ----- ----- ----- ----- ----- -----
        v    14     .     .     .     .     .     . |     .     .     .     .     .     .     2
        x    17     4     5     .     .     2    12 |     .     .     .     .     .     .     4
        y    54     .     .     .     .     2     3 |     .     .     .     .     .     .     6
        j     1     4    16    11     .     .     . |     .     .     .     .     .     .     6
        g     1     3    17     1     .     .     . |     .     .     .     .     .     .     5
        s    11     5     2    37     .     .     . |     .     .     .     .     .     .     4
          ----- ----- ----- ----- ----- ----- -----   ----- ----- ----- ----- ----- ----- -----
      TOT    99    99    99    99    99    99    99 |    99    99    99    99    99    99 57560
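
  For readers who want to reproduce tables of this kind, here is a minimal
  Python sketch of the counting step.  It is not the count-digraph-freqs
  tool used above, whose source is not shown here; the function names and
  the sample words are made up for illustration.  It assumes that the word
  boundary is treated as one more symbol (written ' ') and that each row is
  scaled to 99 and rounded, which is consistent with the first table
  (1363/6420 × 99 rounds to 21).

```python
from collections import Counter

def count_digraphs(words):
    """Count adjacent symbol pairs; ' ' stands for the word boundary."""
    counts = Counter()
    for word in words:
        padded = " " + word + " "
        for prev, nxt in zip(padded, padded[1:]):
            counts[prev, nxt] += 1
    return counts

def next_symbol_probs(counts, scale=99):
    """Scale each row (fixed previous symbol) so that it sums to about `scale`."""
    row_totals = Counter()
    for (prev, _), n in counts.items():
        row_totals[prev] += n
    return {
        (prev, nxt): round(scale * n / row_totals[prev])
        for (prev, nxt), n in counts.items()
    }

# Made-up sample, not taken from the corpus.
words = ["qoljcy", "cgcy", "oix"]
counts = count_digraphs(words)
probs = next_symbol_probs(counts)
print(counts["c", "y"], probs["c", "y"])
```

  The previous-symbol table is obtained the same way, dividing by the column
  totals instead of the row totals.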

  Note that the stroke `l' is always followed by either `j' or `g', hence `lj'
  and `lg' should be single letters.
      
  Note also that there are two clearly different kinds of strokes, "body" B =
  {`c',`o',`t',`i',`q',`l'} and "limb" L = {`v',`x',`y',`j',`g',`s'}.  If we
  reduce the digraph count matrix to these two classes, plus word break W, we
  get

    cat bio-c-jsa-gut.wds \
      | tr 'cotiqlvxyjgs' 'BBBBBBLLLLLL' \
      | count-digraph-freqs

    Digraph counts:

                  B     L
        ----- ----- -----
            .  6420     .
      B    59 19849 15616
      L  6361  9255     .
        ----- ----- -----

    Next-symbol probabilities (× 99):

                  B     L
        ----- ----- -----
            .    99     .
      B     .    55    44
      L    40    59     .
        ----- ----- -----

    Previous-symbol probabilities (× 99):

                  B     L
        ----- ----- -----
            .    18     .
      B     1    55    99
      L    98    26     .
        ----- ----- -----

  Note that every word begins with a body stroke; this was expected from the
  definition of the limb strokes (they can be recognized only by their
  relationship to a previous stroke).  Note also that a limb stroke cannot be
  followed by another limb stroke; this too is not wholly unexpected.

  The surprise is that almost no words *end* in a body stroke.  The least rare
  body stroke in word-final position is `o'.  Here are all the words that end
  in body strokes, in context.  (The "<<" marks the error)

                ctix ois cstcoixo << //
         ciix ciix ctccgciiivcgci << //
       qoljciiiv qjcy ixcstcgcyqo << //
                   qoqjcccgcy ixo << // 
                  oljccgciis cgci << // 
                  qoqjciix ctoixo << // 
                 cgciiscy cgciixo << // 
             cgcy cgciiiiv ctcqjt << // 
             oljcccg qoljciixoiso << // 
           qoljciix qoljccgcy ixo << // 
          cstcccgcy ixctccgcy ixc << // 
          cstccgcy qoljcyisixcstc << // 
          cstcljtcy oisciiisciiso << // 
         isctccgcy qoqjcccgcy ixo << // 
         qoljcccy qoqjcccgcy ixoc << // 

          ljctoix qjctciixoixljcc << cylgctccy 
                      isctcs ctcc << oixcs ciiiiv csljciix
               oixcstccy oixcstcc << qoixcstcccy qoqjcccy 
      cyctcccgcy csciiivo oix ctc << cgcy qoljciis 
               csciiiiv cstccgcci << cstccqjtcy ctccy
           ixctcgcy cgctcgcy cgci << qciis 
                ixljocgciix oqgci << ljoisoixis // 
            qcqjtcccg cstcgcy qci << oixljcccy cgciiiv  
                             // o << cgctcccgcy qoixctccy
                     cyctccciis o << oiiiv occcgcy 
                ctccy qoljciiiv o << isoiiiiv // 
                  ciiiv isciiiv o << ljciiv ctixciiiiiv // 
               cycstccis ciiiiv o << cyljcccgcy qoljcccgcy
               // csciiiv oljci*o << ctccgcy ixljctccgcy 
               cstccqjtcy qoljcio << ixois // 
             cgcicljtcy ixljciijo << cyljcccy ixcstccy
             cstccyljcccgcy ixljo << oqgctccgcy
         qoix oixciiiv cstcoix qo << qoljciix cstcivix // 
                     qoixctccy qo << ixctccg qoixljcy
                      ctcccgcy qo << ctciis ciiiiv // 
             oljcccy ixctccgcy qo << oixciiiv oqj 
                 // cycstccgcy qo << ois oljciiiv 
              cstcqjtcy ctcocy qo << qoljccy cgciixciiiiv 
                ctcccgcy qoix oqo << qoljciiiv oixcstcciij // 
                  qocgctccg oixqo << cgciis ctccljto ixoixoix 
                      cycstcccyqo << isciis oix ctcccy
            oixqo cgciis ctccljto << ixoixoix
                        // ixcsto << qoljccy ixcstccgcy
              cyctcccgcy csciiivo << oix ctc cgcy qoljciis 
          qoljctcgcy ctcqjccy ixo << qoljccgcy qoljciiv     
             qoljctcgcy ctccy ixo << ctcljcy oix 
           qois *cccy ixctccy ixo << cycgciiiv cstccy 
      csciiiisciix csciix cgciixo << qjciiiv cgciiscy cgciixo // 
            oisoix ccccsciix oixo << qjcoix oiscy // 
                             // q << ljciiiv cstccqjcy qoljciiiv 
                             // q << ljccccy cstccgcy qoljcccgcy
                             // q << qoqjccgcstccgcy
           ctqjciiis oqgctccgcy q << qjciix ctccgcy ctcqgctccgcy
             csciix ctccgcy cstcq << lgctois qoqjcicsoixljcy 
          qoljciis cstcccgcy ixct << cstoljciiiv ct*            
                  // ctccy ctcljt << cstcoqccccy ctoix
                  // cgciiiiv cst << qoixctccy 
           oqjciixcy qoljciix cst << oixcstciixcscy // 
                      // qoljccst << qoljccgcy oqjccgcy
           // cgciiiv ctccy ixcst << cgciiiiv ctccy //          

  Those of the first group appear to be interference by the line break.  (Note
  that the manuscript does not appear to use any hyphenation mark.  Either
  words are not broken across lines, which would be unusual, or they are broken
  without any extra marks, which would produce the anomalous endings seen in
  that group.)  Those of the second group appear to be due to bogus word breaks
  in the transcription (e.g. between the `q' and `l') or transcription errors.

  An interesting observation from the body/limb frequency tables above
  is that the transition probabilities from body stroke to body and
  limb are respectively about 56% and 44%.  Thus, if the limb strokes mark
  the end of a syllable (or letter?), then the average number of body
  strokes in a syllable is slightly over 2.  (Considering that we are
  counting each "i" as a body stroke, the correct number may well be
  precisely 2.)
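
  The arithmetic behind "slightly over 2" can be checked directly from the
  class-level counts.  The sketch below is only a check, under the
  assumption that a run of body strokes ends whenever a body stroke is
  followed by a limb stroke or by a word break; the mean run length is then
  the number of body strokes divided by the number of run ends.  The
  variable names are made up; the counts are copied from the B/L table.

```python
# Class-level digraph counts copied from the body/limb table
# (first item: previous class, second: next class; W = word break).
counts = {
    ("W", "B"): 6420,
    ("B", "W"): 59, ("B", "B"): 19849, ("B", "L"): 15616,
    ("L", "W"): 6361, ("L", "B"): 9255,
}

# Every body stroke appears exactly once as the first item of a digraph.
body_total = sum(n for (prev, _), n in counts.items() if prev == "B")

p_body = counts["B", "B"] / body_total   # body followed by body
p_limb = counts["B", "L"] / body_total   # body followed by limb

# A run of body strokes ends at each body stroke NOT followed by a body stroke.
run_ends = body_total - counts["B", "B"]
mean_run = body_total / run_ends

print(f"{p_body:.3f} {p_limb:.3f} {mean_run:.2f}")
```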

  I decided that, before spending more time in the analysis, I must first
  prepare a "corrected" interlinear where discrepancies between FSG and Currier
  are resolved taking into account the probabilities above.

  The idea is to make a dictionary of 5-tuples, and try to use it 
  to decide on the corrections.  Namely, define the context of a letter
  occurrence in a text as its four nearest letter occurrences (two on
  each side).  We can represent a context in sed-like notation as wx.yz,
  where the "." is the position of the central letter.
  
  We scan some training text, collecting for each possible context
  the frequency distribution of the middle letter.  At the end, if,
  for a given context wx.yz there is some central letter t which 
  is more likely than all the others combined, we output a correction
  rule of the form wx?yz -> wxtyz.
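
  The rule-building step just described can be sketched in Python as
  follows.  This is an illustration only, not the generate-fix-patterns
  script used below (whose source is not shown here).  The function names
  are hypothetical, and the sketch assumes that `?' marks an unresolved
  letter and that a context must be seen at least min_occ times, in the
  spirit of the MINOCC parameter.

```python
from collections import Counter, defaultdict

def generate_fix_rules(lines, min_occ=10, unknown="?"):
    """Map (left pair, right pair) -> middle letter, for contexts where one
    middle letter is more frequent than all the others combined."""
    by_context = defaultdict(Counter)
    for line in lines:
        for i in range(2, len(line) - 2):
            window = line[i - 2:i + 3]
            if unknown in window:
                continue  # train only on fully known 5-tuples
            by_context[window[:2], window[3:]][window[2]] += 1
    rules = {}
    for context, middles in by_context.items():
        total = sum(middles.values())
        letter, n = middles.most_common(1)[0]
        if total >= min_occ and 2 * n > total:
            rules[context] = letter
    return rules

def apply_rules(line, rules, unknown="?"):
    """Replace each unknown letter whose context (read from the original
    line) has a rule; leave everything else untouched."""
    out = list(line)
    for i in range(2, len(line) - 2):
        if line[i] == unknown:
            key = (line[i - 2:i], line[i + 1:i + 3])
            if key in rules:
                out[i] = rules[key]
    return "".join(out)

# Made-up training data: context ab.de is seen 15 times, 12 of them with 'c'.
training = ["abcde"] * 12 + ["abxde"] * 3
rules = generate_fix_rules(training)
print(rules, apply_rules("zab?dez", rules))
```

  Requiring a strict majority (2n > total) rather than a mere plurality
  keeps a rule from firing when two middle letters are about equally likely.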
  
  So, here is the work.  First, I generated a training data set:

    cat bio-m-evt.evt \
      | egrep '^<.*;[FC]> ' \
      | grep -v '[][%*_]' \
      | sed \
          -e 's/<.*;[FC]> */  /g' \
          -e 's/{[^}]*}//g' \
          -e 's/\!//g' \
      > .train.txt

     lines   words     bytes file        
    ------ ------- --------- ------------
       858     858     42866 .train.txt

  Next, I generated the correction patterns from it:
  
    cat .train.txt \
      | generate-fix-patterns -vMINOCC=10 \
      > .fixit.sed
       
     lines   words     bytes file        
    ------ ------- --------- ------------
       592     688     10219 .fixit.sed
       
  The parameter MINOCC is the minimum number of times a context must occur before
  we try to generate a correction rule for it. 

  Next, I generated a "consensus" interlinear file:
  
    cat bio-m-evt.evt \
      | make-consensus-interlin \
      > bio-x-evt.evt

  I extracted the consensus text from it:
  
    cat bio-x-evt.evt \
      | egrep '^<.*;J> ' \
      | sed \
          -e 's/{[^}]*}//g' \
          -e 's/[\!]//g' \
      > bio-j-evt-raw.evt

  I applied the corrections:
  
    cat bio-j-evt-raw.evt \
      | sed -f .fixit.sed \
      > bio-j-evt.evt
      
  Now let's extract the words and check how many good ones we got:
  
    cat bio-j-evt.evt \
      | sed \
          -e 's/<.*;[A-Z]> *//g' \
          -e 's/- *$/.\/\//g' \
          -e 's/= *$/.\/\/.=/g' \
      | tr '.' '\012' \
      | egrep '.' \
      > bio-j-evt.wds
         
    cat bio-j-evt.wds | sort | uniq -c | sort -k1,1nr > bio-j-evt.frq
    
    cat bio-j-evt.wds | sort | uniq > bio-j-evt.dic
    
     lines   words     bytes file        
    ------ ------- --------- ------------
      7216    7216     39223 bio-j-evt.wds
      1761    1761     12154 bio-j-evt.dic
    
  I extracted the good words:
  
    cat bio-j-evt.wds | grep -v '?' > bio-j-evt-gut.wds
    
    cat bio-j-evt-gut.wds | sort | uniq > bio-j-evt-gut.dic

    cat bio-j-evt-gut.wds | sort | uniq -c | sort -k1,1nr > bio-j-evt-gut.frq
    
     lines   words     bytes file        
    ------ ------- --------- ------------
      6188    6188     31705 bio-j-evt-gut.wds
      1085    1085      6532 bio-j-evt-gut.dic
    
  I created an automaton for bio-j-evt-gut.dic:
  
    cat bio-j-evt-gut.dic \
      | nice MaintainAutomaton \
          -add - \
          -dump bio-j-evt-gut.dmp

     strings  letters    states   finals     arcs  sub-sts lets/arc
    -------- --------  -------- -------- -------- -------- --------
        1085     5447       422       90     1263     1027    4.313

  I looked for unproductive states:

    nice AutoAnalysis \
      -load bio-j-evt-gut.dmp \
      -unprod bio-j-evt-gut-1-unp.sts \
        -maxUnprod 1 \
        -unprodSugg bio-j-evt-gut-1-unp.sugg
        
       46 unproductive states
       46 strange words (with repetitions) listed
       31 strange words (without repetitions) listed

    // 2PTG 42OEDCC8G 4GDAM 4O4ODCCG 4ODGE88G 8ARTCC8AE 8OEDCC2OE
    CPAEOIR DOHAEG EODAK GCPAM GDT8AR HG4ODG HOROES28G HT8OEH8G
    ODAROEOK OEPODZG OGSCG OHTOHAR OSCPOE2 P8AESOR PGDC8G PODAN
    POECC8ARAL POEHCSOE PSAROE RTCAE8 TCIROR TETPSCCG TOEDCCCG

  I removed these words and tried again:
  
    cat bio-j-evt-gut-1-unp.sugg \
      | sort -u \
      | bool 1-2 bio-j-evt-gut.dic - \
      > bio-j-evt-cln-1.dic
      
    cat bio-j-evt-cln-1.dic \
      | nice MaintainAutomaton \
          -add - \
          -dump bio-j-evt-cln-1.dmp  

     strings  letters    states   finals     arcs  sub-sts lets/arc
    -------- --------  -------- -------- -------- -------- --------
        1054     5237       365       87     1176      950    4.453

    nice AutoAnalysis \
      -load bio-j-evt-cln-1.dmp \
      -unprod bio-j-evt-cln-1-unp.sts \
        -maxUnprod 1 \
        -unprodSugg bio-j-evt-cln-1-unp.sugg

        4 unproductive states
        4 strange words (with repetitions) listed
        3 strange words (without repetitions) listed

      42AN HSCODCC8G CPTG

  I removed these and tried again:

    cat bio-j-evt-cln-1-unp.sugg \
      | sort -u \
      | bool 1-2 bio-j-evt-cln-1.dic - \
      > bio-j-evt-cln-2.dic
      
    cat bio-j-evt-cln-2.dic \
      | nice MaintainAutomaton \
          -add - \
          -dump bio-j-evt-cln-2.dmp  

     strings  letters    states   finals     arcs  sub-sts lets/arc
    -------- --------  -------- -------- -------- -------- --------
        1051     5220       360       87     1167      944    4.473

    nice AutoAnalysis \
      -load bio-j-evt-cln-2.dmp \
      -unprod bio-j-evt-cln-2-unp.sts \
        -maxUnprod 1 \
        -unprodSugg bio-j-evt-cln-2-unp.sugg

        0 unproductive states
        0 strange words (with repetitions) listed
        0 strange words (without repetitions) listed

  I recoded it into the "super-analytic" encoding,
  but this time treating `qj', `qg', `lj', `lg' as single letters
  (`h', `k', `f', `p' respectively):
  
    cat bio-j-evt.wds \
      | fsg2jsa \
      | jsa2hoc \
      > bio-j-hoc.wds

    cat bio-j-hoc.wds | sort | uniq > bio-j-hoc.dic
      
    cat bio-j-hoc.wds | sort | uniq -c | sort -k1,1nr > bio-j-hoc.frq

     lines   words     bytes file        
    ------ ------- --------- ------------
      7216    7216     56394 bio-j-hoc.wds
      1761    1761     17523 bio-j-hoc.dic

  Next I separated the good words:
  
    cat bio-j-hoc.wds \
      | egrep '^[a-z+^]*$' \
      > bio-j-hoc-gut.wds
  
    cat bio-j-hoc.dic \
      | egrep '^[a-z+^]*$' \
      > bio-j-hoc-gut.dic
      
    bool 1-2 bio-j-hoc.dic bio-j-hoc-gut.dic \
      > bio-j-hoc-bad.dic

     lines   words     bytes file        
    ------ ------- --------- ------------
      5427    5427     44172 bio-j-hoc-gut.wds
      1083    1083      9958 bio-j-hoc-gut.dic
       678     678      7565 bio-j-hoc-bad.dic
 
  Next I built the automaton:
  
    cat bio-j-hoc-gut.dic \
      | nice MaintainAutomaton \
          -add - \
          -dump bio-j-hoc-gut.dmp

     strings  letters    states   finals     arcs  sub-sts lets/arc
    -------- --------  -------- -------- -------- -------- --------
        1083     8875       701       91     1492     1258    5.948
        
  Digraph statistics:
  
    cat bio-j-hoc-gut.wds \
      | count-digraph-freqs 
      
    Digraph counts:

              |     i     o     c     t     q     f     p     h     k |     v     j     x     s     y     g   TOT
        ----- + ----- ----- ----- ----- ----- ----- ----- ----- ----- + ----- ----- ----- ----- ----- ----- -----
            . |   396  1146  2210     .  1398    94     1   112    70 |     .     .     .     .     .     .  5427
        ----- + ----- ----- ----- ----- ----- ----- ----- ----- ----- + ----- ----- ----- ----- ----- ----- -----
      i     4 |  2248     2     8     .     .     2     .     .     . |   497    40  1979   650     .     .  5430
      o    19 |  1371     1    69     .     5  1190     8   455    60 |     .     .     .     .     .     .  3178
      c     1 |  1367   150  3487  1201     .   245     2   134    25 |     .     .     .  1187  3301  2408 13508
      t     4 |    17    73  2320     .     .    14     1    10     3 |     .     .     .     .     .     .  2442
      q     1 |     .  1383    21     .     .     1     .     .     . |     .     .     .     .     .     .  1406
      f     6 |     5    47  1543   180     .     .     .     .     . |     .     .     .     .     .     .  1781
      p     . |     .     2    15     1     .     .     .     .     . |     .     .     .     .     .     .    18
      h     3 |     2    41   606   103     .     .     .     .     . |     .     .     .     .     .     .   755
      k     2 |     .    38   111    14     .     .     .     .     . |     .     .     .     .     .     .   165
        ----- + ----- ----- ----- ----- ----- ----- ----- ----- ----- + ----- ----- ----- ----- ----- ----- -----
      v   493 |     .     1     3     .     .     .     .     .     . |     .     .     .     .     .     .   497
      j    40 |     .     .     .     .     .     .     .     .     . |     .     .     .     .     .     .    40
      x  1101 |     5   116   545     .     1   183     4    18     6 |     .     .     .     .     .     .  1979
      s   540 |     1   128   222   943     .     2     .     .     1 |     .     .     .     .     .     .  1837
      y  3161 |    12     3    49     .     2    46     2    26     . |     .     .     .     .     .     .  3301
      g    52 |     6    47  2299     .     .     4     .     .     . |     .     .     .     .     .     .  2408
        ----- + ----- ----- ----- ----- ----- ----- ----- ----- ----- + ----- ----- ----- ----- ----- ----- -----
    TOT  5427 |  5430  3178 13508  2442  1406  1781    18   755   165 |   497    40  1979  1837  3301  2408 44172

  Again, it is obvious that `ix', `ij', and `iv' are single letters; we will drop the `i' from them.
  The same goes for `cy' and `cg'; we will drop the `c'.
  
  We may also let `cs' and `is' be single letters.  This is the right
  thing to do if the distribution of the letter after the `s' depends
  on the letter before the s:
  
                   o     c     t     i     k     f   TOT
         ----- ----- ----- ----- ----- ----- ----- -----
      cs    45    86   109   943     1     1     2  1187
      is   495    42   113     .     .     .     .   650

      Next-symbol probabilities (× 99):

                   o     c     t     i     k     f   TOT
         ----- ----- ----- ----- ----- ----- ----- -----
      cs     4     7     9    79     .     .     .    99
      is    75     6    17     .     .     .     .    99

  They are similar except that `cs' is often followed by `t', whereas
  `is' is often terminal and is never followed by `t'.  (Not surprising,
  since `t' only appears after `c' in this corpus.)
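
  One way to put a number on "depends on the letter before the s" is to
  compare the two next-letter distributions directly.  The Python sketch
  below is only an illustration: the function names are hypothetical, and
  the choice of total variation distance (half the sum of absolute
  differences between the two normalized distributions) is an assumption
  made here, not something used elsewhere in these notes.

```python
from collections import Counter

def next_after(words, digraph):
    """Count the symbol following each occurrence of `digraph`
    (' ' means the digraph ended the word)."""
    dist = Counter()
    for word in words:
        padded = word + " "
        start = padded.find(digraph)
        while start != -1:
            dist[padded[start + len(digraph)]] += 1
            start = padded.find(digraph, start + 1)
    return dist

def total_variation(p, q):
    """0 when the two count distributions have the same shape,
    1 when they share no symbol at all."""
    n_p, n_q = sum(p.values()), sum(q.values())
    return 0.5 * sum(abs(p[k] / n_p - q[k] / n_q) for k in set(p) | set(q))

# Counts copied from the `cs' / `is' table (first column = word end).
after_cs = Counter({" ": 45, "o": 86, "c": 109, "t": 943, "i": 1, "k": 1, "f": 2})
after_is = Counter({" ": 495, "o": 42, "c": 113})
distance = total_variation(after_cs, after_is)
print(round(distance, 2))
```

  With the table counts the distance comes out at about 0.8, which is far
  from 0 and so points toward treating `cs' and `is' as distinct letters.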
  
  But OK, let's replace `cs' by `s' and `is' by `r':

    cat j.wds \
      | sed -f fsg2jsa.sed \
      > bio-j-hoc.wds

    cat bio-j-hoc.wds | sort | uniq > bio-j-hoc.dic
      
    cat bio-j-hoc.wds | sort | uniq -c | sort -k1,1nr > bio-j-hoc.frq
    
     lines   words     bytes file        
    ------ ------- --------- ------------
      7216    7216     44840 bio-j-hoc.wds
      1761    1761     13898 bio-j-hoc.dic
    
    cat bio-j-hoc.wds \
      | egrep '^[a-z67+^]*$' \
      > bio-j-hoc-gut.wds
  
    cat bio-j-hoc.dic \
      | egrep '^[a-z67+^]*$' \
      > bio-j-hoc-gut.dic
      
    bool 1-2 bio-j-hoc.dic bio-j-hoc-gut.dic \
      > bio-j-hoc-bad.dic
      
     lines   words     bytes file        
    ------ ------- --------- ------------
      5427    5427     34110 bio-j-hoc-gut.wds
      1083    1083      7646 bio-j-hoc-gut.dic
       678     678      6252 bio-j-hoc-bad.dic

  Digraph statistics:
  
    cat bio-j-hoc-gut.wds \
      | count-digraph-freqs 

    Digraph counts:

                  q     o     c     s     y     g     x     t     i     r     f     h     p     k     v     j   TOT
        ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
            .  1398  1146   810   865   104   431   310     .     .    86    94   112     1    70     .     .  5427
      q     1     .  1383    18     2     1     .     .     .     .     .     1     .     .     .     .     .  1406
      o    19     5     1    40     8     3    18  1139     .    12   215  1190   455     8    60     .     5  3178
      c     1     .   150   964    36   731  1756     .  1201  1366     1   245   134     2    25     .     .  6612
      s    45     .    86    92    10     4     3     1   943     .     .     2     .     .     1     .     .  1187
      y  3161     2     3    17    23     .     9     7     .     .     4    46    26     2     .     .     1  3301
      g    52     .    47   403    35  1860     1     5     .     .     1     4     .     .     .     .     .  2408
      x  1101     1   116   262   126    98    59     3     .     .     2   183    18     4     6     .     .  1979
      t     4     .    73  1953     4   243   120    14     .     .     3    14    10     1     3     .     .  2442
      i     4     .     2     2     4     .     2   493     .   886   338     2     .     .     .   497    34  2264
      r   495     .    42    69    14    27     3     .     .     .     .     .     .     .     .     .     .   650
      f     6     .    47  1370    21   151     1     5   180     .     .     .     .     .     .     .     .  1781
      h     3     .    41   513    21    70     2     2   103     .     .     .     .     .     .     .     .   755
      p     .     .     2    14     1     .     .     .     1     .     .     .     .     .     .     .     .    18
      k     2     .    38    85    17     6     3     .    14     .     .     .     .     .     .     .     .   165
      v   493     .     1     .     .     3     .     .     .     .     .     .     .     .     .     .     .   497
      j    40     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .    40
        ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
    TOT  5427  1406  3178  6612  1187  3301  2408  1979  2442  2264   650  1781   755    18   165   497    40 34110

  There is something funny about the `t'.  I must try to either (a) identify it with `c',
  or (b) join it with the preceding `c' or `s' as a single letter.  Since most `t's
  have been misidentified as `c's, it is safer to do (a).