# Last edited on 2010-08-15 00:21:27 by stolfilocal

OBTAINING DIBCO 2009 DATASET

  On 2010-08-06 I obtained the DIBCO 2009 Text Binarization Challenge
  images from "http://users.iit.demokritos.gr/~bgat/DIBCO2009/benchmark/".

  After unpacking with "unrar" I ran the script
  "DIBCO2009/convert-images.sh" to convert them to PNG.
  Then they were renamed as follows:

    FETCH/dibc2009/orig/${NNN}.png -- test images
    FETCH/dibc2009/true/${NNN}.png -- ground truth bitmaps

CREATING THE TEXT REGION MASKS

  The next step is to create a mask "rmsk/${NNN}-${KKK}.png" for each
  image pair "orig/${NNN}.png", "true/${NNN}.png" and each homogeneous
  text area (with the same font size, font style, and
  foreground/background colors). The mask must completely cover the
  relevant parts of the ground truth bitmap and must not touch any
  other parts of it. It may include arbitrary amounts of background,
  even if not homogeneous.

    make-region-masks.sh

COPYING TO COMMON REPOSITORY

  Moving the files to the general repository, with set id "dibc2009":

    projdir=~/projects/text-tracking
    dset="dibc2009"
    forig="${projdir}/data/full/orig"
    ftrue="${projdir}/data/full/true"
    fmask="${projdir}/data/full/mask"
    mkdir -p {${forig},${ftrue},${fmask}}/${dset}
    for f in 01 02 03 04 05 ; do
      mv printed-test-RAW/P$f.png ${forig}/${dset}/0$f.png
      mv printed-test-TRU/P$f.png ${ftrue}/${dset}/0$f.png
    done
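
  The copy step above can also be written as a function that checks
  all source files before moving anything, so that a missing file does
  not leave the repository half-populated. This is only a sketch: the
  name "move_printed_set" and its argument order are hypothetical and
  not part of the original scripts. It assumes the same five printed
  images P01..P05 and the same orig/true/mask layout as above.

```shell
# Sketch only: move_printed_set is a hypothetical helper.
# Usage: move_printed_set RAWDIR TRUDIR FULLDIR DSET
move_printed_set() {
  local rawdir="$1"
  local trudir="$2"
  local fulldir="$3"
  local dset="$4"
  local forig="${fulldir}/orig/${dset}"
  local ftrue="${fulldir}/true/${dset}"
  local fmask="${fulldir}/mask/${dset}"
  local f

  # Check every source file first; abort before anything is created or moved.
  for f in 01 02 03 04 05; do
    if [ ! -f "${rawdir}/P${f}.png" ] || [ ! -f "${trudir}/P${f}.png" ]; then
      echo "missing P${f}.png in ${rawdir} or ${trudir}" >&2
      return 1
    fi
  done

  # Create the three per-set directories (the mask directory stays empty here).
  mkdir -p "${forig}" "${ftrue}" "${fmask}" || return 1

  # P01.png becomes 001.png, P02.png becomes 002.png, and so on.
  for f in 01 02 03 04 05; do
    mv "${rawdir}/P${f}.png" "${forig}/0${f}.png" || return 1
    mv "${trudir}/P${f}.png" "${ftrue}/0${f}.png" || return 1
  done
}
```

  With the paths used above, the call would be:

    move_printed_set printed-test-RAW printed-test-TRU ~/projects/text-tracking/data/full dibc2009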