Assignment 2: word zone hypotheses

The page may be slightly changed in the future. Be sure to refresh (F5) right before you start.

Goal

The goal of this assignment is to make a generator for word zone hypotheses.

A word zone hypothesis is a set of coordinates defining a region (zone) that possibly contains a word. This can be the first step of a word-based handwriting recognizer. To save you some hassle, the line zones are provided.

Assignment

Make a word zone hypothesis generator. Given an image and a file describing the line zones, output a new file describing an abundance of word zones.
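One possible way to produce an abundance of hypotheses is sketched below. It is only an illustration, not the required method. It assumes that a line zone has already been cut out of the image as a binary grid (nonzero means ink) and that a zone is a (left, top, right, bottom) tuple with exclusive right and bottom; the actual coordinate convention is whatever the .words format prescribes. The names ink_runs, word_zone_hypotheses and max_runs are hypothetical and do not come from the provided files.

```python
from typing import List, Tuple

# Assumed convention: (left, top, right, bottom), right and bottom exclusive.
Zone = Tuple[int, int, int, int]


def ink_runs(line_img: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges that contain ink; end is exclusive."""
    width = len(line_img[0]) if line_img else 0
    # A column counts as ink if any pixel in it is nonzero.
    has_ink = [any(row[x] for row in line_img) for x in range(width)]
    runs = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                  # a new run of ink columns begins
        elif not ink and start is not None:
            runs.append((start, x))    # the run ended at the previous column
            start = None
    if start is not None:
        runs.append((start, width))    # run reaches the right edge
    return runs


def word_zone_hypotheses(line_img: List[List[int]],
                         line_zone: Zone,
                         max_runs: int = 4) -> List[Zone]:
    """Propose one zone for every group of up to max_runs consecutive ink runs."""
    left, top, _right, bottom = line_zone
    runs = ink_runs(line_img)
    zones = []
    for i in range(len(runs)):
        for j in range(i, min(i + max_runs, len(runs))):
            # Span from the start of run i to the end of run j,
            # shifted from line-local to page coordinates.
            zones.append((left + runs[i][0], top, left + runs[j][1], bottom))
    return zones
```

Grouping consecutive ink runs deliberately over-generates: a gap between two runs may lie between words or within a word, so both the split and the merged zone are kept as hypotheses.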

The command to call the program should be like this:
python zonehypo.py input.ppm input.words output.words
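where input.ppm is the image, input.words is the file describing the line zones, and output.words is the new file to which the word zones are written.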

The program should ignore any word zones that are already present in the input file.
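A minimal sketch of how zonehypo.py might handle its command line is given below. It covers only the three positional arguments of the command above; reading the image and the .words files is left out because that depends on the provided modules. The function name parse_args is a hypothetical choice.

```python
from typing import List, Tuple

USAGE = "usage: python zonehypo.py input.ppm input.words output.words"


def parse_args(argv: List[str]) -> Tuple[str, str, str]:
    """Return (image_path, lines_path, output_path) from an argv-style list."""
    # argv[0] is the script name, followed by exactly three paths.
    if len(argv) != 4:
        raise SystemExit(USAGE)
    image_path, lines_path, output_path = argv[1:]
    return image_path, lines_path, output_path
```

In the script itself this would typically be called as parse_args(sys.argv), after importing sys.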

The input and output files with the extension .words use the format defined in the .words file description. The provided wordio.py makes reading and writing such files easy.

Make sure that the C++ part of your code compiles with the command 'make'. This is required.

Submit your program using Nestor's Digital Dropbox.

Hints

Grading

Appendix: Description of files in assignment2/

File                    Description
cocos_arnold/           C++ routines for fast connected components labeling by Arnold Meijster (no need to look inside).
cocoslib.cpp            Procedures to compute connected components in document images (uses cocos_arnold/).
cocoslib.h              Describes what you can do with cocoslib.cpp.
cocoslib.i              Interface between C++ and Python for cocoslib.
example_cocos.py        Provides a quickstart for using connected components.
generate_wordzones.py   Illustrates reading an image, reading a .words file with lines and writing a .words file with random word zones.
pageview.py             Viewer component, used by word-annotation.py.
pamImage.cpp@           Link to file in toolbox/
pamImage.h@             Link to file in toolbox/
pamImage.i@             Link to file in toolbox/

Last modified: 29 April 2009 by Axel Brink.