Assignment 2: Ascenders and descenders

Make a recognizer lxj.py that translates word images to "lxj-notation". This notation can be interpreted in two ways:

Example

For example, with this input image:

In interpretation 1, the output would be:
xxxllxjxxj
In interpretation 2, the output would be:
xxxlxlxxjxxxj
You may choose either interpretation, but state which one you chose in a Readme file.

For testing, the images will be cut like the green rectangle:

Note that, for example, the 'f', 'G' and sometimes 's' cause trouble, because they have an ascender as well as a descender. These cannot be represented correctly in this encoding. In this assignment, they have to be represented as 'l', 'l' and 'x', respectively. For the final assignment, it may be useful to choose a better representation.

Input / output

The program must be executed like this:
python lxj.py wordimage.ppm
The output must be written to the standard output (the screen). In Python, this means using the 'print' keyword to output the lxj-string.
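A minimal sketch of how the entry point could be laid out, assuming the recognition itself lives in a function called recognize; that name and its empty return value are hypothetical placeholders, not part of the framework:

```python
import sys


def recognize(image_path):
    """Hypothetical placeholder: return the lxj-string for one word image."""
    # A real recognizer would load the .ppm file and analyse it here.
    return ""


def main(argv):
    """Handle 'python lxj.py wordimage.ppm' and return an exit status."""
    if len(argv) != 2:
        sys.stderr.write("usage: python lxj.py wordimage.ppm\n")
        return 1
    # Write the lxj-string to standard output.
    print(recognize(argv[1]))
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

Diagnostic messages go to standard error in this sketch, so that standard output carries only the lxj-string.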

The program will be tested by feeding several images and measuring the average Levenshtein distance between the result and the real answer.
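To estimate your score on your own data, the measure can be sketched as below. This is the standard Levenshtein distance with unit costs for insertion, deletion and substitution; the actual test script is not shown here, so the function names and the averaging over (predicted, expected) pairs are assumptions.

```python
def levenshtein(a, b):
    """Return the edit distance between strings a and b (unit costs)."""
    # previous[j] is the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # match or substitute
        previous = current
    return previous[-1]


def average_distance(pairs):
    """Average Levenshtein distance over (predicted, expected) pairs."""
    return sum(levenshtein(p, e) for p, e in pairs) / float(len(pairs))
```

For instance, levenshtein("xxl", "xxj") is 1, because a single substitution turns one string into the other.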

Before you start

Move the .words files you created yourself to a separate directory (as a backup) and set them aside. From now on, use the .words files in /student/hwr/trainset to get word images and train your system.

Available tools

In /student/hwr/framework/:

Submission

Put your complete program, including the Readme file, in a .zip / .tar / .tgz / .tar.gz archive in Nestor's digital drop box.

Visualization of annotation

Hiscore

lxj-hiscore!

Last modified: May 24, 2007 by Axel Brink.