Last edited on 2009-01-31 17:29:58 by stolfi

Processing the IAB-2002 fragment pictures
03 - Gathering labels and positions of image landmarks

LANDMARKS

  /Landmarks/ are special points on an image. Landmarks include seed
  points for object segmentation, reference points for camera
  calibration, centers of color calibration patches, bases of light
  gauges, etc.

LANDMARK ATTRIBUTES

  A landmark has three main attributes: the /label/, the /image
  position/, and the /world position/.

    * The label is a string consisting of one or more characters in
      [-A-Za-z0-9], beginning with [A-Za-z]. The label usually
      specifies the type of landmark, according to conventions to be
      established later.

    * The image position consists of five numbers: the /image
      coordinates/ {H V D} and the /image extent/ {RH RV}.

      The image coordinate system is that of "display" and many other
      tools: the upper left corner is {H = 0, V = 0} and the lower
      right corner is {H = NH, V = NV}, where {NH} and {NV} are the
      image dimensions in pixels. Note that a pixel is a square of
      unit side in this system, so the center of the pixel in column
      {ih} and row {iv} (both integer counts from 0) is actually at
      {H = ih+0.5, V = iv+0.5}.

      The third coordinate {D} is the depth, the distance from the
      camera after the perspective map. It is measured in pixels
      relative to the nominal projection plane of the camera, so its
      units and origin are somewhat arbitrary.

      The parameter {RH} is half of the nominal width of the landmark
      on the image, and {RV} is half its nominal height. Both are
      measured in pixels. The rectangle {(H±RH)×(V±RV)} is the
      /nominal bounding box/ of the landmark.

    * The world position consists of three /world coordinates/
      {X Y Z} and the /world radius/ {RW}. These quantities should be
      measured in millimeters relative to a coordinate system attached
      to an object of the scene.

  Any of these parameters may be {NAN} (denoted by "?" in the master
  file) if it is meaningless or not known.
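
  As an illustration of the conventions above, here is a minimal
  Python sketch that checks the label syntax, computes the center of a
  pixel, and builds the nominal bounding box. The function names are
  hypothetical and are not part of the original tools; unknown values
  are assumed to be represented as floating-point NaN.

```python
import math
import re

# Label syntax from the text: characters in [-A-Za-z0-9], starting with a letter.
LABEL_RE = re.compile(r"[A-Za-z][-A-Za-z0-9]*")


def is_valid_label(label):
    """Return True if the whole string obeys the label syntax."""
    return LABEL_RE.fullmatch(label) is not None


def pixel_center(ih, iv):
    """Image coordinates {H, V} of the center of the pixel in column ih, row iv."""
    return (ih + 0.5, iv + 0.5)


def nominal_bbox(h, v, rh, rv):
    """Nominal bounding box (H-RH, H+RH, V-RV, V+RV).

    If any input is NaN (unknown), the affected bounds come out as NaN too.
    """
    return (h - rh, h + rh, v - rv, v + rv)
```

  For example, {is_valid_label("C12-034")} is true, while a label that
  begins with a digit is rejected.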

  The interpretation of {RH}, {RV}, and {RW} depends on the kind of
  landmark, as discussed below. For most landmarks, the nominal
  bounding box {(H±RH)×(V±RV)} tightly encloses the landmark on the
  image. Likewise, for many landmarks {RW} is meant to be the radius
  of a sphere centered at {X,Y,Z} that encloses that object in
  3-space.

  Normally a landmark must have at least one known position, either
  {H, V} (with {D} possibly unknown) or {X, Y, Z}. Camera calibration
  points are useful only if they have both positions fully known.

  Sometimes both positions are unknown: this could be the case, for
  example, of the centroid of an object that is known to have been
  included in a photo but which is occluded, or has yet to be
  identified in it.

  If and when the camera is calibrated, it may be possible to compute
  the missing coordinates from the given ones.

  If the image position is known, the landmark may be shown as a
  labeled dot, ellipse, or box in diagnostic images.

WORLD COORDINATE SYSTEM

  If the scene includes a /reference grid/ of some sort, the world
  coordinate system normally has the origin at the `bottom left'
  corner of the grid, with the X axis running along the grid's
  `bottom' side, the Y axis along the grid's `left' side, and the Z
  axis perpendicular to the grid's plane, oriented according to the
  right-hand convention.

  (The terms `bottom' and `left' in the above paragraph are usually
  relative to the camera's `bottom' and `left', but need not be.)

LANDMARK TYPES

  Object centroids

    An object centroid gives the approximate location of an object
    (e.g. a pot or fragment) in the image. It is used as a seed for
    segmentation and to position the fragment label in diagnostic
    images.

    The image coordinates {H,V} should be specified. It helps if the
    extents {RH,RV} are specified, too. The depth {D}, the world
    coordinates {X,Y,Z}, and the world radius {RW} may be left
    unspecified.

    The label should be "{L}{NN}-{KKK}" where {L} is a capital letter,
    {NN} is a two-digit number, and {KKK} is a three-digit number.

    If {L} is "C", then {NN} identifies a bag of fragments from the
    Caju fragment collection at IAB, and {KKK} is a serial fragment
    number within the batch, assigned at IAB and painted on the
    fragment itself, together with the IAB batch number. These numbers
    are listed in "coords/fragment-nums.txt".

    Labels that begin with "U" were assigned by J. Stolfi to fragments
    and whole objects that had no official label, such as un-numbered
    fragments and whole pieces photographed at the IAB main building.

    Labels that begin with "V" are temporary labels assigned randomly
    by J. Stolfi to fragments whose name could not be read. The same
    fragment thus may have a "C" or "U" label in some images, and one
    or more "V" labels in other images.

    Ideally the same fragment or object should get the same "C" or "U"
    label in all images where it appears. There are however many
    errors in the files.

    Fragment positions were collected manually (with the "xv" viewer)
    for most of the batches "stds", "mirr", "objs" and some of the
    "misc".

  Calibration points

    This kind of landmark is intended for camera calibration. Both the
    image coordinates {H,V} and the world coordinates {X,Y,Z} must be
    given. The depth {D} and the extents {RH,RV,RW} are irrelevant and
    may be set to "?" or 0.

    The label must have the form "grid{NN}" where {NN} is an arbitrary
    serial number.

    Camera calibration requires at least four non-collinear points. It
    may require more points and/or points that are not coplanar. At
    least four points with {Z = 0} are needed to get a virtual
    reference grid in diagnostic images.

    Between 2003 and 2004, four reference points were collected for
    all images which contained a reference grid. The points were
    always nodes of the grid, so that {X,Y} were multiples of 10 mm,
    and {Z} was always 0.
    Because of shadows, occlusions, and accidental cropping, different
    nodes were chosen in each batch. An effort was made however to use
    the same nodes in all images that differed only in lighting, e.g.
    "TEC/stds/024/f1{a,b,c}.jpg". The image coordinates {H,V} of the
    reference points were picked with "xv"'s eyedropper, often working
    on the reduced version (1:4) of each image.

    Some calibration points were collected on the geometric gauges
    (green monoliths) appearing on some images. The points "monoEX"
    and "monoWX" are the crosshairs on the top, for the East (left)
    and West (right) monoliths, respectively. The points "monoE{k}"
    and "monoW{k}" are the top corners for {k = A,B,C,D}, and the
    corresponding bottom corners for {k = a,b,c,d}. The corners are
    extrapolated to the "ideal" positions (those they would have if
    the monoliths had sharp corners). {!!! Recover this date !!!}

    It is hard to determine which monolith is which in the pictures;
    chances are that they often switched places between sessions. For
    the time being, the leftmost one is "E", the other is "W".
    Similarly, it is hard to tell which corner is which; the labels
    "A,B,C,D" are assigned arbitrarily in each image.

  Upper corner of grid

    There should be one landmark of this kind for each image, with
    label "gridsz". Its image coordinates {H,V,D} may be unspecified,
    but the world coordinates {X,Y,Z} must be {GX,GY,0} where {GX} and
    {GY} are the X and Y extents of the reference grid. This
    information is necessary to get a virtual reference grid in
    diagnostic images. The extents {RH,RV,RW} are irrelevant and
    should be "?" or 0.

    The grids used by Helena at IAB had thin black lines spaced 10 mm
    apart, so the dimensions {GX} and {GY} are multiples of 10.

  Gray/color calibration patches

    Each landmark of this type is the approximate centroid of a color
    or gray patch that could be used for photometric correction.

    The label of such a landmark should have the form "Y-{FR}-{MK}".
    The {FR} tag identifies a particular arrangement of patches (such
    as the Kodak Q-13 scale, or a particular version of the /stand/
    used by Helena at IAB). The {MK} tag identifies one particular
    patch of that arrangement (such as a step of the Q-13 scale, or
    one of the gray disks attached to the stage).

    The image extents {RH,RV} should be those of an ellipse that is
    entirely INSIDE the image of the patch, so that one can use the
    average pixel color inside that ellipse for color calibration. The
    world radius {RW}, on the other hand, should ENCLOSE the patch.

  Light gauge positions

    A light gauge is an object added to a scene in order to record the
    local light flow. Every light gauge used here is a matte-white
    sphere with a short pedestal.

    The image coordinates {H,V} and the world coordinates {X,Y,Z},
    when given, should be the center of the base of the pedestal.
    {!!! Revise this --- it should be the center of the ball !!!}
    The world radius {RW} should be the radius of the ball, measured
    from the ball's center.

OLD PROJECTIVE MAP FILE

  For each image "{I}.jpg" there was a file "{I}.grmat" describing the
  correspondence

    {H,V} <--> {X,Y,0}

  This correspondence is a two-dimensional projective map and is
  represented in the file "{I}.grmat" as a {3 × 3} homogeneous matrix
  {M} and its inverse {N}.

OLD DERIVED LANDMARK FILES

  Between 2002 and 2008 the labels and positions ({H,V}, sometimes
  also {X,Y,Z}) of all landmarks of each image were collected in five
  separate files, one file per type. The files for a sorted image
  named "{G}/{K}/{B}/{I}.jpg" were saved in the same directory, with
  the following names and contents:

    {I}.fctrs   Centroids of fragments and other notable objects.
    {I}.balls   Light gauge positions.
    {I}.grays   Centroids of color/grayscale calibration patches.
    {I}.grpts   Key points on the reference grid.
    {I}.grsz    Dimensions of the reference grid.

  Each of these files was present only when the image "{I}.jpg" had at
  least one landmark of the corresponding type.

  The file "{I}.grsz" contained only two numbers, the dimensions IN
  CENTIMETERS of the reference grid, if any. If all images of a batch
  had the same reference grid, the files "{I}.grsz" of all those
  images could be replaced by a single file called
  "{G}/{K}/{B}/batch.grsz".

  Except for "{I}.grsz", all files above had the same format, namely
  one line per landmark with format

    "{H} {V} {X} {Y} {Z} {LABEL}"

  Note that the image depth {D} and the extents {RH,RV,RW} were not
  given in these files.

OLD MASTER LANDMARK FILES AND SCRIPTS

  The derived landmark files were previously created by separate
  scripts acting on separate 'master' data files in the "coords"
  directory:

    tools/write-gridsizes     gridsizes.txt
    tools/write-gridpoints    gridpoints.txt, monopoints.txt
    tools/write-fragcenters   fragcenters.txt
    tools/write-graycenters   graycenters.txt
    tools/write-lightballs    lightballs.txt

  {!!! We should unify all these files and scripts. !!!}

  Note that the depth {D} and extents {RH,RV,RW} were not given in
  these files.

OLD LANDMARK DIAGNOSTIC IMAGES

  Previously we had one separate diagnostic image for each kind of
  landmark:

    {I}-n.jpg   Shows the fragment centers.
    {I}-b.jpg   Shows the light balls.
    {I}-y.jpg   Shows the gray patches.
    {I}-g.jpg   Shows the calibration points and the reference grid.

  In all these images, each landmark is shown as a colored dot with
  the label written next to it. The image "{I}-g.jpg" also shows the
  reference grid, drawn over the image, with the nodes and lines
  positioned as implied by the calibration points.

  {!!! We must unify these images.
  !!!}

  To produce these images, the scripts created temporary files
  "{I}.fcmds", "{I}.gcmds", "{I}.ycmds", "{I}.bcmds" holding graphics
  commands for the "ImageMagick" tool, used with its "-draw" option.
  These files were usually deleted after use.

NEW PROJECTIVE MAP FILE

  For each image "{I}.jpg" there will be a file "{I}.grmat" describing
  the correspondence

    {H,V,D} <--> {X,Y,Z}

  This correspondence is a three-dimensional projective map and is
  represented in the file "{I}.grmat" as a {4 × 4} homogeneous matrix
  {M} and its inverse {N}.

NEW MASTER LANDMARK FILE

  In the new landmark master data file, each landmark should be
  written as a line

    "! {H} {V} {D} {RH} {RV} {X} {Y} {Z} {RW} {LABEL}"

  Each quantity may be a number, "?" (meaning "meaningless" or
  "unknown"), or "@" (meaning "same as in previous images").

  The image-related quantities {H,V,D,RH,RV} will be internally scaled
  by a magnification factor provided by "mag = {N}" lines; this
  feature is meant to simplify the gathering of landmarks from
  browse-sized versions of the images.

  The full format of the file is explained in the script
  "expand-master-landmark-file". {!!! Adapt from convert-objcenters !!!}

NEW DERIVED LANDMARK FILES

  From the new master landmark file we will generate a single file
  "{G}/{K}/{B}/{I}.marks" for each image "{G}/{K}/{B}/{I}.jpg",
  containing all the landmarks of that image. Each line of this file
  has the format

    "! {H} {V} {D} {RH} {RV} {X} {Y} {Z} {RW} {LABEL}"

  where, as before, {H,V,D,RH,RV} are in pixels, scaled to match the
  full-resolution raw image; and {X,Y,Z,RW} are in millimeters.

  These files replace the old landmark files "{I}.fctrs", "{I}.balls",
  "{I}.grays", "{I}.grpts", "{I}.grsz".

NEW LANDMARK IMAGES

  Likewise, for each raw image there will be a single diagnostic image
  "{G}/{K}/{B}/{I}-m.jpg" showing all the associated landmarks.

  These files replace the old landmark diagnostic images "{I}-n",
  "{I}-b", "{I}-g", "{I}-y".
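
  The line format of the derived ".marks" files described above could
  be read along the following lines. This is only a sketch, not the
  actual tool: the names {Landmark} and {parse_marks_line} are
  hypothetical, fields are assumed to be separated by blanks, and "?"
  is mapped to NaN. The "@" code and the "mag = {N}" lines of the
  master file are not handled here, since their full format is
  defined elsewhere.

```python
import math
from typing import NamedTuple


class Landmark(NamedTuple):
    """One landmark: image fields in pixels, world fields in millimeters."""
    h: float
    v: float
    d: float
    rh: float
    rv: float
    x: float
    y: float
    z: float
    rw: float
    label: str


def parse_field(tok):
    """Convert one numeric field; "?" (unknown or meaningless) becomes NaN."""
    return math.nan if tok == "?" else float(tok)


def parse_marks_line(line):
    """Parse a line "! H V D RH RV X Y Z RW LABEL" (blank-separated, assumed)."""
    toks = line.split()
    if len(toks) != 11 or toks[0] != "!":
        raise ValueError("not a landmark line: %r" % line)
    values = [parse_field(t) for t in toks[1:10]]
    return Landmark(*values, label=toks[10])
```

  A calibration point, for instance, would have known {H,V} and
  {X,Y,Z}, with "?" in the {D}, {RH}, {RV}, and {RW} fields.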

======================================================================
%% NEVER DONE

REFINING THE LANDMARKS

  The first step to refine the landmarks was to extract a small
  subimage (64 by 64 pixels) surrounding each landmark. This is done
  by the script

    extract-refpoint-nbhoods: extract-gridpoints-nbhoods-all

  (Should find the best possible match of a transformed background
  grid and the segmented background found in step 1.)
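
  The 64 by 64 neighborhood extraction mentioned above needs a pixel
  window around each landmark. A minimal sketch of that computation
  follows; the function name is hypothetical, and shifting the window
  so that it stays inside the image is an assumption, since the
  behavior of the original scripts near the image borders is not
  documented here.

```python
import math


def nbhood_window(h, v, nh, nv, size=64):
    """Pixel window (h0, v0, h1, v1) of size x size around landmark (h, v).

    Columns h0..h1-1 and rows v0..v1-1 are included. The window is shifted
    as needed to fit inside an image of nh columns by nv rows (assumption).
    """
    if size > nh or size > nv:
        raise ValueError("image is smaller than the requested window")
    # Pixel that contains the landmark (a pixel spans [ih, ih+1) in H).
    ih = int(math.floor(h))
    iv = int(math.floor(v))
    # Roughly center the window on that pixel, then clamp it to the image.
    h0 = min(max(ih - size // 2, 0), nh - size)
    v0 = min(max(iv - size // 2, 0), nv - size)
    return (h0, v0, h0 + size, v0 + size)
```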