
Separating inks and paints
by Bayesian classification:
Principles and basic test

Test image and provinces of interest

As a first test of the idea, we use the following clip of page f79r from the "Biological" or "Balneological" section, cropped to 800x1024+63+2385. It covers the green-painted pool at the SW corner of the page. (Click on thumbnails for the full-size images.)

For this first test, we consider three provinces: pairwise disjoint regions of the page with distinct natures, each of which should have a relatively homogeneous appearance. They are named parch (parchment), green (green paint), and dkink (dark ink).

For reasons discussed below, we cannot expect each province to have a single uniform color, but rather a characteristic gamut of colors. So, from each of these provinces, we manually pick a sample of pixels which we believe to be representative of its color gamut. These samples are conveniently specified by black-and-white masks. For clarity, the sampled pixels of the input image are also shown, below each mask.
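A minimal sketch of how a mask might select the sample pixels, assuming that the image and the mask are grids of the same size and that white (nonzero) mask pixels mark the sample. The function name sample_pixels and the tiny 2x3 image are hypothetical, for illustration only.

```python
def sample_pixels(image, mask):
    """Return the colors of the image pixels where the mask is nonzero (white)."""
    if len(image) != len(mask):
        raise ValueError("image and mask must have the same number of rows")
    samples = []
    for img_row, mask_row in zip(image, mask):
        if len(img_row) != len(mask_row):
            raise ValueError("image and mask rows must have the same length")
        for color, m in zip(img_row, mask_row):
            if m:  # white mask pixel: this pixel belongs to the sample
                samples.append(color)
    return samples


# Hypothetical 2x3 image (RGB values in [0,1]) and a hypothetical mask for it.
image = [
    [(0.80, 0.70, 0.50), (0.20, 0.40, 0.20), (0.82, 0.72, 0.52)],
    [(0.10, 0.08, 0.05), (0.78, 0.69, 0.49), (0.21, 0.42, 0.22)],
]
parch_mask = [
    [1, 0, 1],
    [0, 1, 0],
]
parch_samples = sample_pixels(image, parch_mask)
```

Keeping the samples as a flat list of colors discards the pixel positions, which is all that the color-based analysis needs.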

Masks and sampled pixels (images): parch.png, green.png, dkink.png

Color samples as point clouds in 3-space

Each color taken from the input image can be visualized as a point of the RGB color cube, whose corners are the eight "basic" RGB colors: (0,0,0) for black, (1,1,1) for white, (1,0,0) for red, (1,1,0) for yellow, etc. A sample of colors is a cloud of points in this cube.

The images below are three different views of the three clouds corresponding to the three samples above. The colors of the points in these pictures are not the colors themselves; they are arbitrarily assigned to identify the samples -- red for the parch sample, blue for the green sample, and cyan for the dkink sample. (For clarity, only a subset of ~1000 points from each cloud is plotted.)
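The mapping from pixel colors to points of the cube, and the thinning of each cloud for plotting, could be sketched as follows. This assumes the input image stores 8-bit channel values (0 to 255); the names to_unit_cube and subsample are hypothetical.

```python
import random


def to_unit_cube(rgb8):
    """Map an 8-bit (R, G, B) triple to a point of the unit RGB cube."""
    return tuple(c / 255.0 for c in rgb8)


def subsample(points, n, seed=0):
    """Pick at most n points at random, without replacement, for plotting."""
    if len(points) <= n:
        return list(points)
    return random.Random(seed).sample(points, n)


yellow = to_unit_cube((255, 255, 0))  # the corner (1, 1, 0) of the cube

# A hypothetical cloud of 5000 gray points, thinned to 1000 for plotting.
cloud = [to_unit_cube((k % 256, k % 256, k % 256)) for k in range(5000)]
plotted = subsample(cloud, 1000)
```

A fixed seed makes the plotted subset reproducible from one run to the next.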

It can be seen in these snapshots that the clouds are distinctly not spherical. The causes for the spread include:

Modeling and inverting the color distributions

Those point clouds were approximated by trivariate Gaussian probability density functions (PDFs). Think of each distribution as a fuzzy ellipsoid with three unequal axes, with generic position and orientation. These, in particular, have the longest axis pointing roughly (but not exactly) towards the black point (0,0,0).
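One common way to fit such a Gaussian is to take the mean and the covariance matrix of the sample; the text does not say which fitting method was actually used, so the sketch below is only an assumption. The function name fit_gaussian and the sample values are hypothetical.

```python
def fit_gaussian(samples):
    """Return (mean, cov) for a list of RGB triples; cov is a 3x3 nested list."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = [sum(s[i] for s in samples) / n for i in range(3)]
    cov = [[0.0] * 3 for _ in range(3)]
    for s in samples:
        d = [s[i] - mean[i] for i in range(3)]
        for i in range(3):
            for j in range(3):
                cov[i][j] += d[i] * d[j]
    # Unbiased estimate: divide the accumulated products by n - 1.
    cov = [[c / (n - 1) for c in row] for row in cov]
    return mean, cov


# Hypothetical sample of parchment colors (RGB values in [0,1]).
parch_samples = [
    (0.80, 0.70, 0.50),
    (0.82, 0.72, 0.52),
    (0.78, 0.69, 0.49),
    (0.70, 0.61, 0.43),
]
parch_mean, parch_cov = fit_gaussian(parch_samples)
```

In this model, the axes of the fuzzy ellipsoid are the eigenvectors of the covariance matrix, and their lengths are proportional to the square roots of its eigenvalues.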

Each distribution is a mathematical model that gives the probability Pr(color(p)=C | p∈S) that a pixel p will have color C, assuming that it lies in province S (parch, green, or dkink) from which the corresponding set of color samples was taken.
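For a trivariate Gaussian with a given mean and covariance matrix, this quantity (strictly a probability density, since colors vary continuously) can be evaluated as in the sketch below. The helper names det3, inv3, and gaussian_pdf are hypothetical, and the mean and covariance values are made up for illustration.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as a nested list."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def inv3(m):
    """Inverse of a 3x3 matrix by the adjugate formula."""
    d = det3(m)
    if d == 0:
        raise ValueError("singular matrix")
    inv = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            # Cofactor of entry (j, i); cyclic indices take care of the sign.
            a, b = (j + 1) % 3, (j + 2) % 3
            c, e = (i + 1) % 3, (i + 2) % 3
            inv[i][j] = (m[a][c] * m[b][e] - m[a][e] * m[b][c]) / d
    return inv


def gaussian_pdf(color, mean, cov):
    """Density of the trivariate Gaussian (mean, cov) at the given color."""
    d = [color[i] - mean[i] for i in range(3)]
    inv = inv3(cov)
    # Squared Mahalanobis distance of the color from the mean.
    q = sum(d[i] * inv[i][j] * d[j] for i in range(3) for j in range(3))
    norm = math.sqrt((2 * math.pi) ** 3 * det3(cov))
    return math.exp(-0.5 * q) / norm


# Hypothetical model of one province, and the density at two colors.
mean = [0.80, 0.70, 0.50]
cov = [[0.0040, 0.0030, 0.0020],
       [0.0030, 0.0035, 0.0022],
       [0.0020, 0.0022, 0.0030]]
near = gaussian_pdf((0.79, 0.70, 0.50), mean, cov)
far = gaussian_pdf((0.20, 0.40, 0.20), mean, cov)
```

The density is largest at the mean and falls off fastest along the shortest axis of the ellipsoid.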

What we want instead is Pr(p∈S | color(p)=C), which is the probability that a pixel p belongs to province S, given that its color is C. For that we use Bayes's formula:

Pr(p∈S | color(p)=C) = K Pr(p∈S) Pr(color(p)=C | p∈S)

where Pr(p∈S) is the prior probability of pixel p belonging to province S -- that is, the probability we assign to that fact before we are given its color C; and K is a constant such that the sum of the left-hand side over all provinces S is equal to 1.

To use this formula we must consider one extra province, OTHER, consisting of all parts of the input image that are not included in any of the provinces of interest.
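As a rough sketch, Bayes's formula with the normalizing constant K can be applied to one pixel as follows. The function name posteriors is hypothetical, and the likelihood and prior values are made-up numbers; in particular, the text does not say which color distribution is assumed for OTHER, so its likelihood here is just a placeholder.

```python
def posteriors(likelihoods, priors):
    """Apply Bayes's formula; both arguments are dicts keyed by province name."""
    joint = {s: priors[s] * likelihoods[s] for s in priors}
    total = sum(joint.values())
    if total <= 0:
        raise ValueError("color has zero likelihood under every province")
    K = 1.0 / total  # makes the posteriors add up to 1
    return {s: K * v for s, v in joint.items()}


# Hypothetical values of Pr(color(p)=C | p in S) for one pixel color C.
likelihoods = {"parch": 12.0, "green": 0.5, "dkink": 0.01, "OTHER": 1.0}
# Hypothetical priors Pr(p in S).
priors = {"parch": 0.3, "green": 0.3, "dkink": 0.3, "OTHER": 0.1}

post = posteriors(likelihoods, priors)
best = max(post, key=post.get)  # the most probable province for this pixel
```

Without OTHER in the sum, a color far from all three clouds would still be assigned entirely to the three chosen provinces, because K would rescale even tiny likelihoods to add up to 1.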

For each prior probability Pr(p∈S) we could use the fraction of the area of the image that belongs to province S, if that number were known. In practice it is not available, because it depends on assigning each pixel to one of the provinces -- which is precisely the goal of this analysis. Thus we set Pr(p∈OTHER) to some arbitrary value P0, and for each chosen province S we set Pr(p∈S) to (1-P0)/m, where m = 3 is the number of chosen provinces.
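The prior assignment described above amounts to the following small computation. The function name make_priors and the value 0.1 for P0 are hypothetical; the text leaves P0 arbitrary.

```python
def make_priors(provinces, p_other):
    """Give OTHER the prior p_other and split the rest equally among provinces."""
    if not 0.0 <= p_other <= 1.0:
        raise ValueError("p_other must be a probability")
    m = len(provinces)
    if m == 0:
        raise ValueError("need at least one province")
    priors = {s: (1.0 - p_other) / m for s in provinces}
    priors["OTHER"] = p_other
    return priors


P0 = 0.1  # hypothetical choice
priors = make_priors(["parch", "green", "dkink"], P0)
```

With m = 3 and this P0, each chosen province gets prior (1-0.1)/3 = 0.3, and the four priors add up to 1.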

The following images show the result of this analysis. For each of the three provinces plus OTHER, we get a grayscale image whose value at each pixel p is the probability of p belonging to that province, based on its color. To make the images more intuitive, the image for parch is shown as computed (probability 1 = white) while the others are inverted (probability 1 = black).
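The display convention just described could be sketched as below, assuming 8-bit grayscale output. The function name to_gray and the tiny probability maps are hypothetical.

```python
def to_gray(prob_map, invert=False):
    """Convert a 2D list of probabilities in [0,1] to 8-bit gray levels."""
    out = []
    for row in prob_map:
        gray_row = []
        for p in row:
            p = min(1.0, max(0.0, p))  # guard against rounding overshoot
            v = (1.0 - p) if invert else p
            gray_row.append(int(round(255 * v)))
        out.append(gray_row)
    return out


# Hypothetical 1x2 probability maps for two of the provinces.
maps = {
    "parch": [[1.0, 0.0]],
    "green": [[0.0, 1.0]],
}
# parch is shown as computed; every other map is inverted.
images = {s: to_gray(m, invert=(s != "parch")) for s, m in maps.items()}
```

With this convention, a pixel that is certainly parchment is white in the parch image, and a pixel that is certainly green paint is black in the green image.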

Probability maps (images): parch, green, dkink, OTHER

Discussion

The results of this analysis are mostly expected, with a few exceptions. Let's look closely at the OTHER probability map:

Apart from these specific cases, the OTHER map has scattered pixels along the edges of the ink strokes, both on the text (in the northeast quadrant) and on the drawing (in the water spout at northwest and its water stream, and on the face and arms of the nymph). The color of these pixels is a mix of the full-ink color and the parchment color, and thus they are excluded from both of those categories.
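A toy calculation shows why such mixed colors can fall outside both clouds. All numbers here are hypothetical, and each cloud is simplified to a sphere with the same spread in every direction, whereas the fitted clouds are elongated; the sketch only illustrates the general effect.

```python
import math


def mix(c1, c2, t):
    """Blend two RGB colors: t = 0 gives c1, t = 1 gives c2."""
    return tuple((1 - t) * a + t * b for a, b in zip(c1, c2))


def distance_in_sigmas(color, mean, sigma):
    """Distance from a cloud's center, in standard deviations (spherical cloud)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(color, mean))) / sigma


# Hypothetical cloud centers and a hypothetical common spread.
PARCH_MEAN = (0.80, 0.70, 0.50)
INK_MEAN = (0.25, 0.18, 0.12)
SIGMA = 0.05

edge = mix(PARCH_MEAN, INK_MEAN, 0.5)  # pixel half covered by ink
d_parch = distance_in_sigmas(edge, PARCH_MEAN, SIGMA)
d_ink = distance_in_sigmas(edge, INK_MEAN, SIGMA)
```

Under these assumptions the half-and-half color lies several standard deviations away from both centers, so both Gaussian densities are very small there and the posterior shifts towards OTHER.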

It seems that the green paint (only) has a bluish-green component that can bleed through the parchment. This bluish-green pigment seems to be used almost alone in some pages, such as f8r. Here (and in the rest of the Bio section) it seems to be mixed with a yellowish or ocher pigment that does not bleed, resulting in a darker and more yellowish green. The McCrone report says that the green pigment (without distinguishing the two) is a "resinate of copper". But that is awfully vague. Could it be that it is not a tempera (gouache) paint, but a fatty copper dye dissolved in turpentine?


Last edited on 2025-10-24 05:01:45 by stolfi