# Last edited on 2007-04-10 18:21:48 by stolfi

Project: creating large image databases (with > 10^4 images) to test
image processing algorithms, such as QBIC, classification,
compression, etc.

See the file 00-README for a list of subdirectories.

DATABASE STRUCTURE

  An image database is a directory ${dir} containing the following
  files:

    ${dir}/{AAA}/{BBB}/{CCC}/{DDD}.{EXT}

      The image files.  Each of {AAA}, {BBB}, {CCC}, {DDD} is a
      decimal number between 000 and 999; there may be gaps in the
      numbering, but there must be only one image file for each
      combination {AAA}/{BBB}/{CCC}/{DDD}.  The extension {EXT} is
      "jpg", "tif", "png", etc. (excluding "txt").

    ${dir}/{AAA}/{BBB}/{CCC}/{DDD}.txt

      A text file that describes the companion image file
      ${dir}/{AAA}/{BBB}/{CCC}/{DDD}.{EXT}.  See below for details.

    ${dir}/Notebook.txt

      A text file that describes how the other files in ${dir}
      were created.

FORMAT OF IMAGE DESCRIPTION FILES

  Each documentation file "{AAA}/{BBB}/{CCC}/{DDD}.txt" contains a
  number of data fields.  Each data field has the format

    "{KEY}: {VALUES}"

  Important fields are

    "type: {TYP}"        image type ("JPG", "TIF", "PPM", etc.).
    "class: {STAT}"      file class ("F" plain, "L" symlink, "U" unreadable).
    "compr: {CMP}"       file compression ("Z", "gz", etc.; omitted or "-" for none).
    "bytes: {N}"         size in bytes of the (compressed) image file.
    "size: {NX} {NY}"    image width and height in pixels.
    "url: {URL}"         URL of source image, percent-encoded (' ' = "%20", etc.).
    "source: {PATH}"     pathname of source image.
    "frame: {FRAME}"     frame or page index in source image.

  If the source image was rescaled, cropped, or padded, the following
  operation fields may also appear, in the order they were applied:

    "osize: {NX} {NY}"            original size of source image.
    "scale: {NX} {NY}"            image was rescaled to {NX} by {NY}.
    "crop: {NX} {NY} {DX} {DY}"   image was cropped/padded.

  Some fields may be omitted if unknown or defaultable.
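
  The field format above can be read with a few lines of code.  What
  follows is a minimal Python sketch, not part of the original tools.
  It assumes one field per line, ignores lines without a "KEY:"
  prefix, and keeps the "osize", "scale", and "crop" fields as an
  ordered list because their order is significant.  The name
  parse_description and the sample text are hypothetical.

```python
from urllib.parse import unquote

# Number of integers expected in each numeric field (taken from the field list).
INT_FIELDS = {"bytes": 1, "size": 2, "osize": 2, "scale": 2, "crop": 4}
# Operation fields, whose order of appearance must be preserved.
OPERATIONS = ("osize", "scale", "crop")


def parse_description(text):
    """Parse "{KEY}: {VALUES}" lines; return (fields, ops).

    fields maps each non-operation key to its value; ops is the list of
    (key, value) operation fields in the order they appear in the file.
    """
    fields, ops = {}, []
    for line in text.splitlines():
        # Split at the first colon only, so URLs keep their own colons.
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not sep or not key:
            continue  # assumption: anything that is not "KEY: ..." is skipped
        if key in INT_FIELDS:
            numbers = tuple(int(tok) for tok in value.split())
            if len(numbers) != INT_FIELDS[key]:
                raise ValueError(f"bad number of values in field {key!r}")
            value = numbers[0] if len(numbers) == 1 else numbers
        elif key == "url":
            value = unquote(value)  # "%20" becomes " ", etc.
        elif key == "compr" and value == "-":
            value = None  # "-" means no compression
        if key in OPERATIONS:
            ops.append((key, value))
        else:
            fields[key] = value
    return fields, ops


# Hypothetical description file, for illustration only.
SAMPLE = """\
type: PNG
class: F
bytes: 12345
size: 640 480
url: http://example.org/my%20cat.jpg
osize: 1280 960
scale: 640 480
crop: 640 480 0 -10
"""

fields, ops = parse_description(SAMPLE)
print(fields)
print(ops)
```

  Returning the operations separately keeps the order in which they
  were applied, which a plain dictionary keyed by field name would
  lose if the same operation appeared twice.
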

  The source image may differ from the stored image file by
  (preferably lossless) format conversion.

  In the "crop:" operation, the first two numbers {NX},{NY} are the
  width and height of the extracted image {E}, and the next two
  {DX},{DY} are the position of the upper left corner of {E} relative
  to that of the uncropped image {S}.  These displacements may be
  negative, and assume that the Y axis points DOWN.  Any part of {E}
  that is not contained in {S} has been padded in some unspecified
  way.

RAW IMAGE DATABASES

  A raw image database has images of varying sizes, generally as
  obtained from external sources.

  Source files in uncompressed, obscure, obsolescent, or awkward
  formats (including externally compressed formats like "ppm.gz" or
  "bmp.Z") should be converted to PNG if possible.  Multi-image files
  must be split into single images.

  In a finished raw database, all images must be actual readable
  files (file class "F"), not symlinks.

VIRTUAL DATABASES

  Virtual databases are similar to raw ones, except that the files
  ${dir}/{AAA}/{BBB}/{CCC}/{DDD}.{EXT} are symbolic links (file class
  "L") rather than actual files.

STANDARDIZED DATABASES

  Standardized databases are like raw databases, with the following
  added constraints:

    * All images are in PNG format.

    * All images have the same width and height.

  In order to be useful as test cases, other constraints may be
  enforced on each image:

    * It was not enlarged from a smaller image.

    * If the source image was in a lossy or noisy format (JPG, GIF,
      scanned newsprint, etc.), it was scaled down to reduce those
      artifacts.

    * Padding is used only when the source background was already
      uniform, so that it does not introduce significant artifacts
      (sharp edges, streaks, anomalous Fourier components, etc.).

  Source images that are too small to be used singly may be
  clipped/padded to 256x256 or 128x128 and combined into 2x2 or 4x4
  mosaics.  In this case all sub-images of a mosaic have the same
  nature (e.g. plants, microbes, faces, etc.).
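
  Some of the constraints above can be checked mechanically from the
  description fields.  The Python sketch below is only an
  illustration under stated assumptions: each record is a triple
  (name, fields, ops) where fields is a dictionary of parsed fields
  and ops is the ordered list of "osize", "scale", and "crop"
  operations; an omitted "class:" field is taken to mean "F"; and the
  names crop_padding and check_standardized are hypothetical.  It
  cannot tell whether padding was harmless, so it only flags padded
  images for manual inspection.

```python
def crop_padding(src_size, crop):
    """Return the (left, top, right, bottom) padding implied by a crop.

    src_size is the (width, height) of the uncropped image S.  crop is
    (NX, NY, DX, DY), where (DX, DY) is the upper left corner of the
    extracted image E relative to that of S, with the Y axis pointing down.
    """
    sw, sh = src_size
    nx, ny, dx, dy = crop
    left = min(nx, max(0, -dx))             # columns of E to the left of S
    top = min(ny, max(0, -dy))              # rows of E above S
    right = min(nx, max(0, dx + nx - sw))   # columns of E to the right of S
    bottom = min(ny, max(0, dy + ny - sh))  # rows of E below S
    return left, top, right, bottom


def check_standardized(records):
    """Return a list of messages about records that break the constraints."""
    problems = []
    common_size = None
    for name, fields, ops in records:
        if fields.get("type") != "PNG":
            problems.append(f"{name}: not in PNG format")
        # Assumption: an omitted "class:" field means a plain file.
        if fields.get("class", "F") != "F":
            problems.append(f"{name}: not a plain readable file")
        size = fields.get("size")
        if common_size is None:
            common_size = size
        elif size != common_size:
            problems.append(f"{name}: size {size} differs from {common_size}")
        current = None  # image size after the operations seen so far
        for key, value in ops:
            if key == "osize":
                current = value
            elif key == "scale":
                if current and (value[0] > current[0] or value[1] > current[1]):
                    problems.append(f"{name}: enlarged from {current} to {value}")
                current = value
            elif key == "crop":
                if current and any(crop_padding(current, value)):
                    problems.append(f"{name}: padded, check the background")
                current = value[:2]
    return problems


# Hypothetical records, for illustration only.
RECORDS = [
    ("000/000/000/000", {"type": "PNG", "class": "F", "size": (320, 320)},
     [("osize", (640, 480)), ("scale", (427, 320)), ("crop", (320, 320, 53, 0))]),
    ("000/000/000/001", {"type": "JPG", "class": "F", "size": (320, 320)}, []),
    ("000/000/000/002", {"type": "PNG", "class": "F", "size": (320, 320)},
     [("osize", (160, 160)), ("scale", (320, 320))]),
    ("000/000/000/003", {"type": "PNG", "class": "F", "size": (320, 320)},
     [("osize", (300, 320)), ("crop", (320, 320, -10, 0))]),
]

for message in check_standardized(RECORDS):
    print(message)
```
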

CHOOSING THE TARGET DIMENSIONS FOR A STANDARDIZED DATABASE

  Possible options for SQUARE images:

    * {5*2^K} or {7*2^K}, for {K >= 0}.

      Sizes are 5,7,10,14,20,28,40,56,80,112,160,224,320,448,640,896,...
      Scaling factors alternate between {7/5 = 1.4000} and {10/7 = 1.4286}.
      Aspect is always {1:1}.  This series can be used for VGA images
      ({OMIN=480}) without padding and with minimal cropping.

    * {15*2^K} or {21*2^K}, for {K >= 0}.

      Sizes are 15,21,30,42,60,84,120,168,240,336,480,672,960,1344,...
      Scaling factors alternate between {7/5 = 1.4000} and {10/7 = 1.4286}.
      Aspect is always {1:1}.  This series includes the VGA size
      ({OMIN=480}) without any padding/cropping.

    * {2*2^K} or {3*2^K}, for {K >= 0} (Chenca's series).

      Sizes are 2,3,4,6,8,12,16,24,32,48,64,96,128,192,256,384,512,768,...
      Scaling factors alternate between {3/2 = 1.5000} and {4/3 = 1.3333}.
      VGA images require substantial padding or cropping.

  Possible options for RECTANGULAR images:

    * {5*2^K × 7*2^K} or {7*2^K × 10*2^K}, for {K >= 0}.

      Sizes are 5×7,7×10,10×14,14×20,20×28,28×40,40×56,56×80,
      80×112,112×160,160×224,224×320,320×448,448×640,640×896,896×1280,...
      Scaling factors alternate between {7/5 = 1.4000} and {10/7 = 1.4286}.
      Aspect alternates between {7:5 = 1.4000} and {10:7 = 1.4286},
      close to the A-series aspect.  This series includes a very close
      match (448×640) to the VGA size and aspect.  Disadvantage:
      conversion between even and odd sizes uses different X and Y
      scales.

    * {25*2^K × 35*2^K} or {35*2^K × 49*2^K}, for {K >= 0}.

      Sizes are 25×35,35×49,50×70,70×98,100×140,140×196,200×280,280×392,
      400×560,560×784,800×1120,...
      Scaling factors alternate between {7/5 = 1.4000} and {10/7 = 1.4286}.
      Aspect is always {7:5 = 1.4000}, close to the A-series aspect.
      Requires substantial cropping for VGA size.

    * {15*2^K × 20*2^K} or {21*2^K × 28*2^K}, for {K >= 0}.

      Sizes are 15×20,21×28,30×40,42×56,60×80,84×112,120×160,168×224,
      240×320,336×448,480×640,672×896,960×1280,...
      Scaling factors alternate between {7/5 = 1.4000} and {10/7 = 1.4286}.
      Aspect is always {4:3 = 1.3333}, same as VGA size.  Includes an
      exact match to the VGA size 480×640.

    * {2*2^K × 3*2^K} or {3*2^K × 4*2^K}, for {K >= 0}.

      Sizes are 2×3,3×4,4×6,6×8,8×12,12×16,16×24,24×32,32×48,48×64,
      64×96,96×128,128×192,192×256,256×384,384×512,512×768,768×1024,...
      Scaling factors alternate between {3/2 = 1.5000} and {4/3 = 1.3333}.
      Aspect alternates between {3:2 = 1.5000} and {4:3 = 1.3333}.
      The aspect {4:3} matches the VGA aspect, but VGA images require
      substantial padding or cropping.

  Generating good rectangular picture series.  The script below
  considers every size {X} × {Y} with {0.8*NX <= X <= NX} and
  {0.8*NY <= Y <= NY}, and halves it repeatedly while both dimensions
  are even.  For each reduced size {RX} × {RY} that fits in 35 × 35,
  it prints the next size in the series, obtained by scaling with
  {S = H/G}, where {G = gcd(RX,RY)} and {H} is {sqrt(2)*G} rounded to
  the nearest integer:

    gawk \
      ' BEGIN { \
          NX = 640; NY = 480; \
          for (X = NX; X >= 0.8*NX; X--) \
            for (Y = NY; Y >= 0.8*NY; Y--) \
              { K = 0; RX=X; RY=Y; \
                while (1) \
                  { A = RX/RY; LA = log(A/(NX/NY)); \
                    G = gcd(RX,RY); H = int(sqrt(2)*G + 0.5); \
                    S = H/G; LS = log(S/sqrt(2)); \
                    if ((RX <= 35) && (RY <= 35) && (H > G)) \
                      { printf "%4d × %4d = %4d × %4d * 2^%d", X, Y, RX, RY, K; \
                        printf " %2d %2d %6.4f %+06.3f", RX/G, RY/G, A, LA; \
                        printf " * %6.4f %+06.3f = %4d × %4d\n", S, LS, (RX*H)/G, (RY*H)/G; \
                      } \
                    if ((RX % 2 > 0) || (RY % 2 > 0)) { break; } \
                    RX=RX/2; RY=RY/2; K++; \
                  } \
              } \
        } \
        function gcd(a,b,r){while(b>0){r=a%b;a=b;b=r;} return a;} \
      '

  Each output line shows the size {X} × {Y}; its reduced form
  {RX} × {RY} * 2^{K}; the aspect as a reduced fraction {RX/G} {RY/G}
  and as a decimal {A}; the natural log of {A} relative to the target
  aspect {NX/NY}; the scale factor {S}; the natural log of {S}
  relative to sqrt(2); and the next size {RX*S} × {RY*S}.

  Results for 640 × 480 (sorted by aspect and cleansed of duplicates):

     512 ×  480 =   32 ×   30 * 2^4 16 15 1.0667 -0.223 * 1.5000 +0.059 =   48 ×   45
     528 ×  480 =   33 ×   30 * 2^4 11 10 1.1000 -0.192 * 1.3333 -0.059 =   44 ×   40
     544 ×  480 =   34 ×   30 * 2^4 17 15 1.1333 -0.163 * 1.5000 +0.059 =   51 ×   45
     512 ×  448 =   16 ×   14 * 2^5  8  7 1.1429 -0.154 * 1.5000 +0.059 =   24 ×   21
     560 ×  480 =   35 ×   30 * 2^4  7  6 1.1667 -0.134 * 1.4000 -0.010 =   49 ×   42
     576 ×  480 =   18 ×   15 * 2^5  6  5 1.2000 -0.105 * 1.3333 -0.059 =   24 ×   20
     544 ×  448 =   34 ×   28 * 2^4 17 14 1.2143 -0.094 * 1.5000 +0.059 =   51 ×   42
     528 ×  432 =   33 ×   27 * 2^4 11  9 1.2222 -0.087 * 1.3333 -0.059 =   44 ×   36
     512 ×  416 =   32 ×   26 * 2^4 16 13 1.2308 -0.080 * 1.5000 +0.059 =   48 ×   39
     560 ×  448 =   35 ×   28 * 2^4  5  4 1.2500 -0.065 * 1.4286 +0.010 =   50 ×   40
     576 ×  448 =   18 ×   14 * 2^5  9  7 1.2857 -0.036 * 1.5000 +0.059 =   27 ×   21
     544 ×  416 =   34 ×   26 * 2^4 17 13 1.3077 -0.019 * 1.5000 +0.059 =   51 ×   39
     512 ×  384 =    8 ×    6 * 2^6  4  3 1.3333 +0.000 * 1.5000 +0.059 =   12 ×    9
     640 ×  480 =   20 ×   15 * 2^5  4  3 1.3333 +0.000 * 1.4000 -0.010 =   28 ×   21
     512 ×  384 =   32 ×   24 * 2^4  4  3 1.3333 +0.000 * 1.3750 -0.028 =   44 ×   33
     528 ×  384 =   33 ×   24 * 2^4 11  8 1.3750 +0.031 * 1.3333 -0.059 =   44 ×   32
     560 ×  400 =   35 ×   25 * 2^4  7  5 1.4000 +0.049 * 1.4000 -0.010 =   49 ×   35
     544 ×  384 =   34 ×   24 * 2^4 17 12 1.4167 +0.061 * 1.5000 +0.059 =   51 ×   36
     640 ×  448 =   20 ×   14 * 2^5 10  7 1.4286 +0.069 * 1.5000 +0.059 =   30 ×   21
     576 ×  384 =    9 ×    6 * 2^6  3  2 1.5000 +0.118 * 1.3333 -0.059 =   12 ×    8
     640 ×  384 =   10 ×    6 * 2^6  5  3 1.6667 +0.223 * 1.5000 +0.059 =   15 ×    9

  Promising candidates for the 4:3 aspect ratio are

     512 ×  384 =    8 ×    6 * 2^6 1.3333 +0.000 * 1.5000 +0.059 =   12 ×    9
     640 ×  480 =   20 ×   15 * 2^5 1.3333 +0.000 * 1.4000 -0.010 =   28 ×   21

  A promising candidate for the 7:5 aspect ratio is

     560 ×  400 =   35 ×   25 * 2^4 1.4000 +0.049 * 1.4000 -0.010 =   49 ×   35
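
  As a cross-check of the square series listed earlier in this
  section, the Python sketch below regenerates a series from its two
  base sizes and reports the exact scaling factors between
  consecutive sizes.  It is an illustration only; the function names
  are hypothetical, and the code assumes base sizes {a < b < 2*a} so
  that the interleaved sequence is increasing.

```python
from fractions import Fraction


def size_series(a, b, count):
    """Return the first count terms of a, b, 2a, 2b, 4a, 4b, ..."""
    sizes = []
    k = 0
    while len(sizes) < count:
        sizes.append(a << k)  # a * 2^k
        sizes.append(b << k)  # b * 2^k
        k += 1
    return sizes[:count]


def scale_factors(sizes):
    """Return the exact ratio between each size and the previous one."""
    return [Fraction(n1, n0) for n0, n1 in zip(sizes, sizes[1:])]


def largest_fitting(a, b, limit):
    """Return the largest series term not exceeding limit, or None."""
    best = None
    k = 0
    while (a << k) <= limit:
        best = a << k
        if (b << k) <= limit:
            best = b << k
        k += 1
    return best


for a, b in [(5, 7), (15, 21), (2, 3)]:
    series = size_series(a, b, 8)
    factors = sorted(set(scale_factors(series)))
    print(series, [str(f) for f in factors], largest_fitting(a, b, 480))
```

  For a smaller image dimension of 480, largest_fitting gives 448 for
  the 5/7 series, 480 for the 15/21 series, and 384 for the 2/3
  series, which is why the first needs only minimal cropping, the
  second none, and the third a substantial amount.
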