Assignment 2: lxj-recognizer
Goal
Make a recognizer that transcribes a word image to 'lxj'-coding, as agreed upon during the lecture. This encoding captures information about ascenders and descenders. The program must be executed like this:
python lxj.py wordimage.ppm
The output should be like this:
lxxxjxlx
This is the correct output that your program should return when the image contains the word "Example". The exact lxj encoding is in this file: lxj-coding.table. You can suggest changes for this encoding until 2 May.
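The exact mapping lives in lxj-coding.table, so the rule below is only an illustrative guess at how such an encoding could look: letters with an ascender map to 'l', letters with a descender to 'j', everything else to 'x'. The letter sets are assumptions, not the agreed table (note that the real coding evidently does not map one letter to one symbol, since "Example" yields eight symbols).

```python
# Hypothetical lxj-coding of a typed label. The real table is in
# lxj-coding.table; ASCENDERS/DESCENDERS below are an assumption.
ASCENDERS = set("bdfhklt")   # assumed: ascender letters -> 'l'
DESCENDERS = set("gjpqy")    # assumed: descender letters -> 'j'

def lxj_code(word):
    """Map each character to 'l', 'j' or 'x' (body-only)."""
    out = []
    for ch in word.lower():
        if ch in ASCENDERS:
            out.append('l')
        elif ch in DESCENDERS:
            out.append('j')
        else:
            out.append('x')
    return ''.join(out)
```

Use it to turn the training labels into target strings for your recognizer, once you substitute the agreed table.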
If you are interested, read Why Python?
Toolbox
The toolbox gives you a quick start: it saves you from the hassle of reading and writing images to/from files. Read about the details here.
- If you work at home, then make sure that you have the required software installed: Python 2.4 (or newer), source code of the same Python version (that is, the development version), gcc, swig, convert (ImageMagick). Netpbm is recommended (for pnmshear).
- Get the toolbox:
cp -r /home/student/hwr/toolbox ~/hwr/
- Compile the code:
cd ~/hwr/toolbox/
make
- The toolbox only works with raw .ppm/.pgm/.pbm files. That's just simple bitmaps (color, grayscale, black/white respectively). Convert your images to raw PPM:
convert -compress LZW image.tif image.ppm
Note: the 'compress LZW' option is misleading; there is no compression in the PPM format. You can remove the .tif version of the images.
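The toolbox reads these files for you, but it helps to know what a raw PPM looks like: an ASCII header (magic number, width, height, maximum value) followed by binary pixel data. A minimal header reader, using only the standard library and ignoring the rare '#' comments the header may contain:

```python
def read_ppm_header(path):
    """Return (magic, width, height, maxval) of a raw PPM/PGM file.
    Simplified sketch: assumes no '#' comments inside the header."""
    with open(path, 'rb') as f:
        data = f.read(64)  # the header easily fits in 64 bytes
    fields = data.split(None, 4)  # magic, width, height, maxval, pixels...
    magic = fields[0].decode('ascii')
    width, height, maxval = (int(x) for x in fields[1:4])
    return magic, width, height, maxval
```

For a color image produced by convert you should see magic 'P6' and maxval 255.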
Example programs
- Try the cropper example:
python example-crop.py /tmp/NL_HaNa_H2_7823_xxxx.ppm cropped.ppm
It will create an image (cropped.ppm) of a part of the original image.
- Look in the source code of example-crop.py and see how it can call a function in croplib.h and croplib.cpp using croplib.i. Also inspect pamImage.h and see how it's done in the Makefile. (You do not have to look in pamImage.cpp.)
- Try the word cropper example:
python example-wordcrop.py /tmp/NL_HaNa_H2_7823_xxxx.ppm ../words/NL_HaNa_H2_7823_xxxx.words word.ppm
It will create an image of a word and show the typed annotation.
- Try the feature extractor example:
python example-featextract.py word.ppm
It will compute and show a simple feature vector.
- Look in the source code of example-featextract.py, featlib.h, featlib.cpp and featlib.i to see how it works.
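A feature that is directly useful for ascender/descender detection is the horizontal ink-density profile: one ink count per image row. The toolbox's image class is wrapped C++; the sketch below instead models a word image as a plain list of rows of grey values (0 = ink, 255 = background), which is an assumption for illustration only.

```python
# Sketch: ink pixels per row. The main peak marks the text body;
# substantial ink above the body hints at ascenders, below it at
# descenders. The list-of-rows image model is assumed here.
def row_ink_profile(pixels, threshold=128):
    """Return a list with the number of ink pixels in each row."""
    return [sum(1 for v in row if v < threshold) for row in pixels]
```

From this profile you can estimate the body zone and then classify vertical image strips as 'l', 'x' or 'j'.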
Assignment
- Now make your own recognizer that prints the 'lxj'-coding of a given word image.
- You can train your program using a part of the word labels created by you and your fellow students. They can be found in /home/student/hwr/data/words-train.
- Submit your program to Nestor's Digital drop box. Your program will be tested on word images that were labeled by you and the other students. The average edit distance (Levenshtein distance) between the result and the true answer determines your grade for this assignment.
- Your program will appear in a hit list, sorted by average edit distance. You can submit multiple times; the last submission before the deadline determines your grade.
- Make sure that your program does not leave temporary files behind. Because of the automatic testing, stray files can easily fill up the disk space on my account.
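Since the grade depends on the average Levenshtein distance, it pays to compute it yourself while training. A minimal sketch of the classic dynamic-programming version (dist.py in the toolbox already provides this):

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]
```

For example, one missing 'x' in an lxj string costs distance 1.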
Hints
- Use featextract.py as a basis to make your own lxj-recognizer.
- All handwriting in the dataset has an average shear of about 45 degrees. You can unshear an image on the command line using
pnmshear angle imagefile.ppm > outfile.ppm
or from Python:
import os
os.system("pnmshear %i %s > %s" % (angle, imagefile, outfile))
- You can use dist.py (in the toolbox) to compute the edit distance (Levenshtein distance) between two strings.
- Handy links:
Python tutorial - Python reference - C++ reference
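To honour the no-temporary-files rule while still using command-line tools like pnmshear, the tempfile module helps: create a uniquely named file, use it, and always delete it. A small hedged sketch (the helper name is made up for illustration):

```python
import os
import tempfile

def with_temp_ppm(work):
    """Create a temporary .ppm path, pass it to work(path), and
    remove the file afterwards even if work() raises."""
    fd, path = tempfile.mkstemp(suffix=".ppm")
    os.close(fd)
    try:
        return work(path)
    finally:
        if os.path.exists(path):
            os.remove(path)
```

For instance, `with_temp_ppm(lambda p: os.system("pnmshear %i %s > %s" % (angle, imagefile, p)))` would leave nothing behind after the unsheared image has been processed.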
Last modified: 30 April 2008, afternoon.