Last edited on 2009-01-31 17:29:58 by stolfi

Processing the IAB-2002 fragment pictures
03 - Gathering labels and positions of image landmarks

LANDMARKS

  /Landmarks/ are special points on an image.  Landmarks 
  include seed points for object segmentation, reference points 
  for camera calibration, centers of color calibration patches,
  bases of light gauges, etc.
  
LANDMARK ATTRIBUTES
  
  A landmark has three main attributes: the /label/, the /image
  position/, and the /world position/.
  
  * The label is a string consisting of one or more characters in [-A-Za-z0-9],
    beginning with [A-Za-z].  The label usually specifies the type 
    of landmark, according to conventions to be established later.
  
  * The image position consists of five numbers --- the /image
    coordinates/ {H V D}, and the /image extent/ {RH RV}. The image
    coordinate system is that of "display" and many other tools: the
    upper left corner is {H = 0, V = 0} and the lower right corner is
    {H = NH, V = NV}, where {NH} and {NV} are the image dimensions in
    pixels. Note that a pixel is a square of unit side in this system,
    so the center of the pixel in column {ih} and row {iv} (both
    integer counts from 0) is actually at {H = ih+0.5, V = iv+0.5}.
    
    The third coordinate {D} is the depth, the distance from the
    camera after the perspective map. It is measured in pixels relative to
    the nominal projection plane of the camera, so its units and
    origin are somewhat arbitrary.
    
    The parameter {RH} is half of the nominal width of the 
    landmark on the image; and {RV} is half its nominal
    height.  Both are measured in pixels.   The rectangle
    {(H±RH)×(V±RV)} is the /nominal bounding box/ of the
    landmark.

  * The world position consists of three /world coordinates/
    {X Y Z} and /world radius/ {RW}.  These quantities
    should be measured in millimeters relative to
    a coordinate system attached to an object of the scene.
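
  As an illustration of the conventions above, the following Python
  sketch checks a label against the stated character rules and
  computes the center of a pixel and the nominal bounding box. The
  function names are hypothetical; they are not part of the project's
  tools.

```python
import re

# Label rule from the text: one or more characters in [-A-Za-z0-9],
# beginning with a letter.
LABEL_RE = re.compile(r"[A-Za-z][-A-Za-z0-9]*")

def is_valid_label(label):
    """True if the label obeys the character rules stated above."""
    return LABEL_RE.fullmatch(label) is not None

def pixel_center(ih, iv):
    """Image coordinates (H, V) of the center of the pixel in column ih, row iv."""
    return (ih + 0.5, iv + 0.5)

def nominal_bbox(H, V, RH, RV):
    """Nominal bounding box as (Hmin, Hmax, Vmin, Vmax), in pixels."""
    return (H - RH, H + RH, V - RV, V + RV)
```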

  Any of these parameters may be {NAN} (denoted by "?" in the master
  file) if it is meaningless or not known. The interpretation of {RH},
  {RV}, and {RW} depends on the kind of landmark, as discussed below.
  For most landmarks, the nominal bounding box {(H±RH)×(V±RV)}
  tightly encloses the landmark on the image. Likewise, for many
  landmarks {RW} is meant to be the radius of a sphere centered at
  {X,Y,Z} that encloses the object in 3-space.
  
  Normally a landmark must have at least one known position, either
  {H, V} (with {D} possibly unknown) or {X, Y, Z}. Camera calibration
  points are useful only if they have both positions fully known.
  Sometimes both positions are unknown: this could be the case, for
  example, of the centroid of an object that is known to have been
  included in a photo but which is occluded, or has yet to be identified in it.
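
  The rules in the paragraph above could be checked along the
  following lines. This is only a sketch: the function names are
  hypothetical, and unknown values are assumed to be represented as
  floating-point NaN.

```python
import math

def known_positions(H, V, X, Y, Z):
    """Return (image_known, world_known) for one landmark.

    The image position counts as known when H and V are known (D may
    be unknown); the world position requires X, Y and Z.
    """
    image_known = not (math.isnan(H) or math.isnan(V))
    world_known = not (math.isnan(X) or math.isnan(Y) or math.isnan(Z))
    return (image_known, world_known)

def usable_for_calibration(H, V, X, Y, Z):
    """Calibration points need both positions fully known."""
    return all(known_positions(H, V, X, Y, Z))
```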
  
  If and when the camera is calibrated, it may be possible to compute
  the missing coordinates from the given ones. If the image position
  is known, the landmark may be shown as a labeled dot, ellipse, or
  box in diagnostic images.
  
WORLD COORDINATE SYSTEM

  If the scene includes a /reference grid/ of some sort, the world
  coordinate system normally has the origin at the `bottom left'
  corner of the grid, with the X axis running along the grid's
  `bottom' side, the Y axis along the grid's `left' side, and the Z
  axis perpendicular to the grid's plane, oriented according to the
  right-hand convention. (The terms `bottom' and `left' here
  are usually relative to the camera's `bottom' and `left',
  but need not be.)
  
LANDMARK TYPES

  Object centroids

    An object centroid gives the approximate location of an 
    object (e.g. a pot or fragment) in the image.  It is
    used as a seed for segmentation and to position the fragment label in
    diagnostic images.
    
    The image positions {H,V} should be specified. It helps if the extents {RH,RV}
    are specified, too.  The depth {D}, the world coordinates
    {X,Y,Z}, and the world radius {RW} may be left unspecified.

    The label should be "{L}{NN}-{KKK}" where {L} is a capital letter,
    {NN} is a two-digit number, and {KKK} is a three-digit number.
    
    If {L} is "C", then {NN} identifies a bag of fragments from the
    Caju fragment collection at IAB, and {KKK} is a serial fragment
    number within the batch, assigned at IAB and painted on the
    fragment itself, together with the IAB batch number. These numbers
    are listed in "coords/fragment-nums.txt".

    Labels that begin with "U" were assigned by J. Stolfi to fragments
    and whole objects that had no official label, such as un-numbered
    fragments and whole pieces photographed at the IAB main building.

    Labels that begin with "V" are temporary labels assigned randomly by
    J. Stolfi to fragments whose name could not be read. The same
    fragment thus may have a "C" or "U" label in some images, and one or
    more "V" labels in other images.

    Ideally the same fragment or object should get the same "C" or "U"
    label in all images where it appears. There are however many errors
    in the files.
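
    Since the files contain errors, it may help to check object
    labels mechanically. The sketch below parses the
    "{L}{NN}-{KKK}" pattern described above; the function name is
    hypothetical, and the check covers only the syntax, not whether
    the bag or fragment number actually exists.

```python
import re

# "{L}{NN}-{KKK}": capital letter, two digits, hyphen, three digits.
FRAG_LABEL_RE = re.compile(r"([A-Z])([0-9]{2})-([0-9]{3})")

def parse_fragment_label(label):
    """Return (L, NN, KKK) as (str, int, int), or None if the label does not match."""
    m = FRAG_LABEL_RE.fullmatch(label)
    if m is None:
        return None
    return (m.group(1), int(m.group(2)), int(m.group(3)))
```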

    Fragment positions were collected manually (with the "xv" viewer)
    for most of the batches "stds", "mirr", "objs" and some of the
    "misc".
    
  Calibration points
  
    This kind of landmark is intended for camera calibration. Both the
    image coordinates {H,V} and the world coordinates {X,Y,Z} must be
    given. The depth {D} and the extents {RH,RV,RW} are irrelevant
    and may be set to "?" or 0.
    
    The label must have the form "grid{NN}" where {NN} is an
    arbitrary serial number.
    
    Camera calibration requires at least four non-collinear
    points.  It may require more points and/or points that
    are not coplanar.  At least four points with {Z = 0}
    are needed to get a virtual reference grid in diagnostic images.
    
    Between 2003 and 2004, four reference points were collected for
    all images which contained a reference grid. The points were
    always nodes of the grid, so that {X,Y} were multiples of 10 mm,
    and {Z} was always 0. Because of shadows, occlusions, and
    accidental cropping, different nodes were chosen in each batch. An
    effort was made however to use the same nodes in all images that
    differed only in lighting, e.g. "TEC/stds/024/f1{a,b,c}.jpg".

    The image coordinates {H,V} of the reference points were picked
    with "xv"'s eyedropper, often working on the reduced version (1:4)
    of each image.

    Some calibration points were collected on the geometric gauges
    (green monoliths) appearing on some images. The points "monoEX"
    and "monoWX" are the crosshairs on the top, for the East (left)
    and West (right) monoliths, respectively. Points "monoE{k}",
    "monoW{k}", for {k = A,B,C,D} are the top corners; for {k =
    a,b,c,d} are the corresponding corners at the bottom. The corners
    are extrapolated to the "ideal" positions (if the monoliths had
    sharp corners). {!!! Recover this date !!!}

    It is hard to determine which monolith is which in the pictures;
    chances are that they often switched places between sessions. For
    the time being, the leftmost one is "E", the other is "W".
    Similarly, it is hard to tell which corner is which; the labels
    "A,B,C,D" are assigned arbitrarily in each image.

  Upper corner of grid
  
    There should be one landmark of this kind for each image, with
    label "gridsz". Its world coordinates {X,Y,Z} must be {GX,GY,0},
    where {GX} and {GY} are the X and Y extents of the reference
    grid. This information is necessary to get a virtual reference
    grid in diagnostic images. The image coordinates {H,V,D} may be
    left unspecified; the extents {RH,RV,RW} are irrelevant and
    should be "?" or 0.
    
    The grids used by Helena at IAB had thin black lines spaced 10mm
    apart, so the dimensions {GX} and {GY} are multiples of 10.
    
  Gray/color calibration patches
  
    Each landmark of this type is the approximate centroid of 
    a color or gray patch that could be used for photometric
    correction.  
    
    The label of such a landmark should have the form "Y-{FR}-{MK}".
    The {FR} tag identifies a particular arrangement of patches (such
    as the Kodak Q-13 scale, or a particular version of the /stand/
    used by Helena at IAB). The {MK} tag identifies one particular
    patch of that arrangement (such as a step of the Q-13 scale, or
    one of the gray disks attached to the stage).
    
    The image extents {RH,RV} should be those of an ellipse that is entirely
    INSIDE the image of the patch; so that one can use the average
    pixel color inside that ellipse for color calibration. The world
    radius {RW}, on the other hand, should ENCLOSE the patch.
    
  Light gauge positions
  
    A light gauge is an object added to a scene in order to record the
    local light flow. Every light gauge used here is a matte-white
    sphere with a short pedestal.
    
    The image coordinates {H,V} and the world coordinates {X,Y,Z},
    when given, should be the center of the base of the pedestal. {!!!
    Revise this --- it should be the center of the ball !!!}.
    The world radius {RW} should be the radius of the ball, measured
    from the ball's center.
    
OLD PROJECTIVE MAP FILE

  For each image "{I}.jpg" there was a file
  
    {I}.grmat    correspondence {H,V} <--> {X,Y,0}
    
  This correspondence is a two-dimensional projective map
  and is represented in the file "{I}.grmat" as a {3 × 3} 
  homogeneous matrix {M} and its inverse {N}.
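
  For reference, this is how a {3 × 3} homogeneous matrix acts on a
  two-dimensional point. The sketch assumes that points are column
  vectors and that the matrix is stored as a list of rows; the layout
  and conventions actually used in "{I}.grmat" are not described here
  and may differ. The matrices in the example are made up.

```python
def apply_projective_map(M, p):
    """Apply a 3x3 homogeneous matrix M (list of rows) to a 2D point p.

    ASSUMPTION: points are column vectors [p0, p1, 1] and the result is
    M times that vector; the actual "{I}.grmat" convention may differ.
    """
    x, y = p
    u = M[0][0] * x + M[0][1] * y + M[0][2]
    v = M[1][0] * x + M[1][1] * y + M[1][2]
    w = M[2][0] * x + M[2][1] * y + M[2][2]
    if w == 0:
        raise ValueError("point maps to infinity")
    # Divide by the homogeneous weight to get Cartesian coordinates.
    return (u / w, v / w)

# Made-up example: a pure scaling of 0.25 mm per pixel, and its inverse.
M = [[0.25, 0.0, 0.0], [0.0, 0.25, 0.0], [0.0, 0.0, 1.0]]
N = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 1.0]]
```

  With these matrices, {H,V} = {40,80} maps to {X,Y} = {10,20}, and
  {N} maps it back.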

OLD DERIVED LANDMARK FILES

  Between 2002 and 2008 the labels and positions ({H,V}, sometimes
  also {X,Y,Z}) of all landmarks of each image were collected in five
  separate files, one file per type. The files for a sorted image
  named "{G}/{K}/{B}/{I}.jpg" were saved in the same directory, with the
  following names and contents:
  
    {I}.fctrs    Centroids of fragments and other notable objects.
    {I}.balls    Light gauge positions.
    {I}.grays    Centroids of color/grayscale calibration patches.
    {I}.grpts    Key points on the reference grid.
    {I}.grsz     Dimensions of the reference grid.
    
  Each of these files was present only when the image "{I}.jpg"
  had at least one landmark of the corresponding type.
  
  The file "{I}.grsz" contained only two numbers, the dimensions IN
  CENTIMETERS of the reference grid, if any. If all images of a batch
  had the same reference grid, the files {I}.grsz of all those images
  could be replaced by a single file called "{G}/{K}/{B}/batch.grsz".
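
  Because the old grid sizes are in centimeters while world
  coordinates are in millimeters, a conversion is needed when turning
  a "{I}.grsz" file into a "gridsz" landmark. A minimal sketch,
  assuming the two numbers are whitespace-separated with the X extent
  first (the function name is hypothetical):

```python
def grsz_to_gridsz_world(text):
    """Convert the contents of an old "{I}.grsz" file (two numbers, in cm)
    to the world coordinates (GX, GY, 0) of a "gridsz" landmark, in mm.

    ASSUMPTION: the two numbers are whitespace-separated, X extent first.
    """
    fields = text.split()
    if len(fields) != 2:
        raise ValueError("expected exactly two numbers")
    gx_cm, gy_cm = (float(f) for f in fields)
    # 1 cm = 10 mm; the grid lies in the plane Z = 0.
    return (10.0 * gx_cm, 10.0 * gy_cm, 0.0)
```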
  
  Except for "{I}.grsz", all files above had the same format, namely
  one line per landmark with format
  
    "{H} {V}  {X} {Y} {Z} {LABEL}"
    
  Note that the image depth {D} and the extents {RH,RV,RW} were not given in these files.
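
  A line in this old format can be rewritten in the layout of the new
  master file described later by marking the missing fields as "?".
  The sketch below copies the values verbatim and ignores any change
  of units or image scale between the two formats; the function name
  is hypothetical.

```python
def old_to_new_line(line):
    """Rewrite an old derived-file line "{H} {V} {X} {Y} {Z} {LABEL}"
    in the new layout, with "?" for the fields the old files lacked
    ({D}, {RH}, {RV}, {RW}).
    """
    fields = line.split()
    if len(fields) != 6:
        raise ValueError("expected 6 fields, got %d" % len(fields))
    H, V, X, Y, Z, label = fields
    return "! %s %s ?  ? ?   %s %s %s ? %s" % (H, V, X, Y, Z, label)
```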
   
OLD MASTER LANDMARK FILES AND SCRIPTS
  
  The derived landmark files were previously created by separate scripts
  acting on separate 'master' data files in the "coords" directory:
  
    tools/write-gridsizes    gridsizes.txt

    tools/write-gridpoints   gridpoints.txt, monopoints.txt
    
    tools/write-fragcenters  fragcenters.txt 
    
    tools/write-graycenters  graycenters.txt
  
    tools/write-lightballs   lightballs.txt

  {!!! We should unify all these files and scripts. !!!}
  
  Note that the depth {D} and extents {RH,RV,RW} were not given in these files.
   
OLD LANDMARK DIAGNOSTIC IMAGES 

  Previously we had one separate diagnostic image for each kind of landmark:
  
     {I}-n.jpg    Shows the fragment centers.
     {I}-b.jpg    Shows the light balls.
     {I}-y.jpg    Shows the gray patches.
     {I}-g.jpg    Shows the calibration points and the reference grid.
     
  In all these images, each landmark is shown as a colored dot
  with the label written next to it.  The image "{I}-g.jpg" also
  shows the reference grid, drawn over the image, with the nodes
  and lines positioned as implied by the calibration points. 
  
  {!!! We must unify these images. !!!}
  
  To produce these images, the scripts created
  temporary files "{I}.fcmds", "{I}.gcmds", "{I}.ycmds", "{I}.bcmds"
  holding graphics commands for the "ImageMagick" tool, used
  with its "-draw" option.  These files were usually deleted 
  after use.

NEW PROJECTIVE MAP FILE

  For each image "{I}.jpg" there will be a file
  
    {I}.grmat    correspondence {H,V,D} <--> {X,Y,Z}
    
  This correspondence is a three-dimensional projective map
  and is represented in the file "{I}.grmat" as a {4 × 4} 
  homogeneous matrix {M} and its inverse {N}.  
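
  Since the file stores both {M} and {N}, a simple consistency check
  is possible. The sketch below tests whether the product of the two
  is a nonzero multiple of the identity, which is the appropriate
  test for homogeneous matrices because they are defined only up to a
  scale factor. It assumes the matrices are already in memory as
  lists of rows; the function name and the tolerance are
  illustrative.

```python
def is_inverse_pair(M, N, tol=1e-9):
    """True if the product M*N is a nonzero multiple of the identity.

    Homogeneous matrices are defined only up to a scale factor, so the
    product is compared with s*I, where s is its first diagonal entry.
    """
    n = len(M)
    # Plain matrix product P = M * N.
    P = [[sum(M[i][k] * N[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
    s = P[0][0]
    if abs(s) <= tol:
        return False
    return all(abs(P[i][j] - (s if i == j else 0.0)) <= tol * abs(s)
               for i in range(n) for j in range(n))
```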

NEW MASTER LANDMARK FILE

  In the new landmark master data file, each landmark should be written
  as a line

    "! {H} {V} {D}  {RH} {RV}   {X} {Y} {Z} {RW} {LABEL}"

  The quantities may be numbers or "?" for "meaningless" or "unknown"
  or "@" for "same as in previous images". The image-related quantities
  {H,V,D,RH,RV} will be internally scaled by a magnification factor provided
  by "mag = {N}" lines; this feature is meant to simplify the
  gathering of landmarks from browse-sized versions of the images.

  The full format of the file is explained in the script 
  "expand-master-landmark-file". {!!! Adapt from convert-objcenters !!!}
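
  Pending that description, here is a rough sketch of how one
  landmark line of the layout above might be parsed. It is not the
  actual script: the function name is hypothetical, "@" is not
  handled, and the image-related fields are assumed to be MULTIPLIED
  by the magnification factor, which may not match the real
  definition of "mag".

```python
import math

def parse_landmark_line(line, mag=1.0):
    """Parse "! H V D  RH RV   X Y Z RW LABEL" into a dict.

    "?" becomes NaN.  "@" ("same as in previous images") is not handled
    in this sketch and raises ValueError.
    ASSUMPTION: image-related fields are MULTIPLIED by mag; the real
    script may define the factor the other way around.
    """
    fields = line.split()
    if len(fields) != 11 or fields[0] != "!":
        raise ValueError("not a landmark line: %r" % line)
    names = ["H", "V", "D", "RH", "RV", "X", "Y", "Z", "RW"]
    mark = {"label": fields[10]}
    for name, tok in zip(names, fields[1:10]):
        if tok == "@":
            raise ValueError("'@' is not supported in this sketch")
        val = math.nan if tok == "?" else float(tok)
        # Only the pixel quantities are scaled; NaN stays NaN.
        if name in ("H", "V", "D", "RH", "RV"):
            val *= mag
        mark[name] = val
    return mark
```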
  
NEW DERIVED LANDMARK FILES

  From the new master landmark file we will generate a single
  file "{G}/{K}/{B}/{I}.marks" for each image "{G}/{K}/{B}/{I}.jpg",
  containing all the landmarks of that image.
  
  Each line of this file has the format
  
    "! {H} {V} {D} {RH} {RV}   {X} {Y} {Z} {RW}  {LABEL}"
    
  where, as before, {H,V,D,RH,RV} are in pixels, scaled to match
  the full-resolution raw image; and {X,Y,Z,RW} are in
  millimeters.
    
  These files replace the old landmark files "{I}.fctrs", "{I}.balls",
  "{I}.grays", "{I}.grpts", "{I}.grsz".
  
NEW LANDMARK IMAGES

  Likewise, for each raw image there will be a single diagnostic
  image "{G}/{K}/{B}/{I}-m.jpg" showing all the associated landmarks.  
  
  These files replace the old landmark diagnostic images "{I}-n",
  "{I}-b", "{I}-g", "{I}-y".
  
======================================================================
%% NEVER DONE   
    
REFINING THE LANDMARKS

  The first step to refine the landmarks was to extract a small subimage 
  (64 by 64 pixels) surrounding each landmark.  This is done by the script
  extract-refpoint-nbhoods:
  
    extract-gridpoints-nbhoods-all
  
  (Should find the best possible match of a transformed background grid 
  and the segmented background found in step 1.)