Locating parag breaks in the Starred Parags Section

Index

Summary

This webpage reports an attemp at locating the paragraph breaks in the Starred Paragraphs section (SPS) of the Voynich Manuscript (VMS), also known as the Recipes section.

This page has a general discussion of the problem and methodology. For the detailed page-by-page report, see this sub-report.

Paragraphs and why they matter

The text of the SPS clearly consists of multiple paragraphs ('parags' for short), each comprising one or more lines. Each line normally starts at the left margin of the text and continues up to the right margin, or to the end of the paragraph, whichever comes first.

Considering the cost of vellum and the difficulty of erasing it, we assume that the Author (the person who decided to create the book, devised the script, created or obtained the information, etc.) first wrote a draft the SPS on paper, and then recruited a Scribe to traspose the final draft to vellum. We assume that both Author and Scribe understood that the parags were significant, while the line breaks within each parag were not; so that the Scribe disregarded the line breaks in the draft, and introduced new ones whenever he/she reached the right margin. (While the Scribe may have been the Author him/herself, there is evidence that they were different persons, and that the Scribe did not know much about the meaning of text, probably not even its language.)

Identifying the paragraphs as intended by the Author is necessary for certain analyses, such as comparing the frequencies of words at in the start, middle, and end of paragraphs, or trying to identify words that occur preferably at the end of sentences (like the verbs in German and other SOV languages).

There are several features that are believed to indicate those breaks; but they often disagree or are absent, and there are several blocks of consecutive lines that, based on internal and external clues, can be presumed to consist of two or more parags run together. Locating the paragraph breaks within those blocks requires some more or less arbitary choices, guided by the available clues.

This article describes a set of criteria for guiding these choices, and shows the results of applying them to each page of the SPS.

Glossary of general terms

The transcription file

Marking the parags

Ultimately the chosen parag breaks were marked manually in the transcription file, by the prefix "<%>" on head line and the suffix "<$>" on the tail line, as explained in this sub-document. On each head line we also added by hand the the star assignment comments "<!SNN>" and "<!NoS>" The comments "<!WGP>" and "<!WGN>" were used to mark wide linegaps, whether they were chosen parag breaks or not.

A program was then used to evaluate each of these marked parags according to the criteria described below. The results are reported in this sub-document.

Stars in the SPS

The left margin of every page of the SPS contains a column of stars. For the purposes of this report, the stars in each page are designated S01, S02, etc., from the top down.

  • Glossary of star-related terms

  • Appearance

    The stars of the SPS look very similar to those in the Zodiac section, and some of those in the Cosmo section. Each star is typically ~6 mm across. Its outline is drawn with the same ink as the text, and consists of 6 to 9 rays, typically ~1.5 mmm wide at the base and ~2 mm long, with two straight or slightly curved sizes, that form a sharp or slightly rounded point. The bases of the rays define a round body ~2mm across.

    The number of rays may carry some information, but there is no evidence for that.

    Colors

    A few stars are clear -- just outlined, not painted. All the others are partially or totally painted with one of two colors: yel, a partly transparent watercolor-like golden yellow paint (apparently the same paint/ink used on the hair of the nymphs of the Zodiac section), or red, an opaque tempera-like dark red paint (presumably the same used for the lips of nymphs in the Zodiac, flowers in the Herbal section, etc.)

    The cores of stars that are painted red is usually invisible, while that of those painted yel is normally visible through the paint.

    The two colors may have been applied at different times by different people (the Light Painter and the Dark Painter, respectively), who may have had different levels of knowledge about the VMS. Thus it is possible that the yel color on stars has some information, for instance about parag breaks, while the red color has not.

    On most pages the colors red and yel aternate. Sometimes, as in f104r and f115r, there are hiccups where a color is repeated. On some pages, like f108r, the colors seem to be random, and several stars are neither red nor yel (left unpainted or maybe painted white). Since the colors may well be a later addition, we just ignored them when choosing the parag breaks.

    Tails

    Most stars in the SPS have a tail, which is a thin curvy line extending down from the star's outline.

    The tail usually extends or starts from the tip of a ray near 05:00. In this case the ray is often sharper, longer, and curved. (This indicates that the tail was drawn by the same Scribe who drew the stars.) But sometimes the tail starts at the notch between two rays.

    On some stars the ray is extended to most of the length of the tail, thus creating a fat tail. The fat tails may have some information too. For instance, a fat tail may signify that the star is associated with two parag heads that are too close together to receive individual stars.

    However, there doesn't seem to be a sharp distinction between fat and thin tails. The difference is only how far the fat part of the tail extends.

    Drawing order

    While the finished outline of each star is a single continuous line, it may be drawn in two or more separate strokes.

    In some thin-tailed stars, the outline was drawn first and the tail was added as a separate stroke. In other cases the drawing of the outline apparently started at a ray tip around 05:00, went CW all around, then instead of stopping at the starting point continued out to form the (single) tail.

    In the case of stars with fat tails, sometimes the star outline is drawn as two or more strokes in opposite directions, starting at a ray tip in the general NW area, until a ray tip around around 05:00, and each stroke is then extended to make one edge of the tail. See S13 in f113r for example.

    Other times a star with fat tail is drawn by drawing a thin-tail star first (in either of the pssible ways), and then adding the left edge of the tail as a separate stroke. See the tail of S06 in f114v for example.

    Line assignment and starlets

    Presumably, the intention of the Author or Scribe was to have one star for each parag of the SPS, aligned with the parag's head line, like the bullet in an item of an itemized list.

    However the reality is somewhat far from this ideal. A star that is presumably associated with an obvious parag is often located at some distance, up or down, from the head, or is missing entirely. Part of the task of identifying the paragraph breaks is to assign each star to a text line, in a one-to-one way; and deciding which text lines (and, in some cases, which parags) will be left without stars.

    Easy cases: Perfect parags

    Page boundaries and titles

    For reasons detailed below, we assumed that each parag is wholly contained in a single page. That is, page boundaries are hard parag boundaries.

    The SPS contains three short centered or right-justified lines that are generally assumed to be section titles or some such. See this document. We assume that those titles do not belong to any parag, so they are themselves hard parag boundary markers too.

    It is possible that there are other section headers were not recognized as such. We have no way to check for that; such headers, if they exist, have been included in parags.

    Parag properties

    To identify a set of one or more consecutive text lines, not including any "titles", as a paragraph, We used some or all of the following criteria:

    Note that P4 does not *require* the existence of puffs in the head line of a perfect parag. It only prohibits them in the other lines.

    Rules P1 and P2 effectively require that a perfect parag be entirely contained within one page.

    Perfect parags

    A set of consecutive non-title lines that satisfies all conditions P1-P9 will be called a perfect parag.

    By the above criteria, one can identify ??? perfect parags in the SPS, covering ??? or the ??? text lines. When those are excluded, the lines that remain ???

    Quasi-perfect parags

    If a set of consecutive lines satisfies all the conditions P1-P9 except P4, (that is, if some line other than the first one contains puffs) we will call those lines a quasi-perfec parag.

    There are ??? parags that are not perfect but quasi-perfect.

    Pluperfect parags

    Two other conditions that are relevant for parag splitting are:

    Note that condition Q1 is a stronger version of P1. If a set of lines is a perfect parag and satisfies either of these two criteria, we call it a pluperfect parag.

    Hard cases: Imperfect parags

    Possible causes

    While most of the text can be parsed as perfect parags, there are several 'imperfect blocks' of consecutive lines such that, in any block, any candidate parag fails at least one of the criteria P1-P8. Thus, within each of those imperfect blocks we had to determine where to break parags by less objective criteria.

    As discussed above, stars on the right margin do not seem to be reliable parag markers. they are often more than one line off from the head, or missing entirely. There are many possible scenarios for the creation and final scribing of the SMS that would have led to starlets being omitted or misaligned by mistake.

    Another possible cause for dubious parag breaks is the Scribe sometimes starting a new parag on the same line as the tail of the previous parag, when the latter was a short line. For example, on <f105v.P1.6> the parag break should perhaps be after the first word {dcheo}. Likewise, on <f105v.P1.13>, the parag break should perhaps be after the first word {saiin}.

    Splitting the imperfect blocks

    To split imperfect blocks, we set a 'definitive' parag break after every short line, even if the next line cannot get a starlet assigned to it. That line will be the start of an 'unstarred' parag.

    We also put a 'tentative' parg break before any line that has at least one puff, even if the previous line is not short and no starlet can be assigned to it.

    Those two decisions divide each imperfect block into 'tentative parags'. Each tentative parags is a set of consecutive lines such that no line except perhaps the first has any puffs or assigned starlet, and no line except perhaps the last one is short-length.

    The insertion of further parag breaks was guided by the above criteria. The detailed reports are in this sub-report.

    Links


    Last edited on 2026-02-09 15:28:10 by stolfi