Period 2b (block 4), 2014
Progress code: KIM.SCHR03
In this course you learn how an automatic handwriting recognizer works. You will make a recognizer yourself and write a scientific report on it. The focus is on using a character(ish) approach, which can be bootstrapped from labels at the word-level.
The handwriting material for this course is historical handwriting from the “Queen’s Cabinet” (Kabinet der Koningin, stored at the Dutch National Archive, Nationaal Archief, Den Haag) as shown in the figure to the right.
You are expected to form groups of 3 or 4 people. Each group will work towards a handwriting recognition system that uses “smaller-than-words” chunks, such as characters. One of the first tasks for every team is to annotate character labels, or to mine them from annotations at the word level.
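As an illustration of what mining character labels from word-level annotations could look like, here is a minimal sketch in Python. It assumes (this is not specified by the course material) that a word annotation consists of a text label plus the left and right pixel coordinates of the word box, and it simply cuts the box into equal-width slices, one per character. The function name split_word_box and its parameters are hypothetical. Real handwriting has variable character widths, so such slices are at best a rough starting point that a trained recognizer can refine later.

```python
def split_word_box(label, x_left, x_right):
    """Naively assign each character an equal-width horizontal slice of a word box.

    Assumption for illustration only: a word annotation is a text label plus
    the left/right pixel coordinates of its bounding box.
    Returns a list of (character, left, right) tuples.
    """
    if not label:
        return []
    if x_right <= x_left:
        raise ValueError("x_right must be greater than x_left")
    n = len(label)
    width = x_right - x_left
    # Integer boundaries; the first is x_left and the last is exactly x_right.
    bounds = [x_left + (i * width) // n for i in range(n + 1)]
    return [(ch, bounds[i], bounds[i + 1]) for i, ch in enumerate(label)]


if __name__ == "__main__":
    # Hypothetical word annotation: the label "Koning" spanning columns 120..300.
    for ch, left, right in split_word_box("Koning", 120, 300):
        print(ch, left, right)
```

The equal-width split is deliberately crude: it only provides initial character hypotheses, which can then be corrected by hand or re-estimated once a first character model exists.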
Halfway through the course (see the schedule below for the details), each team is expected to hand in a draft of the literature review and Method section of the report.
Programming is done in Python, C, C++, or Java, or a combination of these. For instance, Python can be used to quickly create the general framework and C++ for the low-level procedures. Details on how this works will be provided during the first practical session.
To facilitate cooperation within a group, you are advised to use version control software. Well-known packages are Git and Subversion. If you would like a Subversion repository, please ask Jean-Paul to set one up for you (it is a good idea to arrange this before the first practical session, so that you can start committing changes from the first session onward).
At the end of the course, you submit the final version of your recognizer and a written, scientific report.
On Thursdays, Prof. dr. Schomaker will give a lecture, after which each group will present a progress update (more on that below).
On Wednesdays, practical sessions supervised by Jean-Paul van Oosten are scheduled; you can use these to work on your recognizer, collaborate with your group, ask questions, etc.
In the final lecture, each group will present its complete classifier, the approach, and the empirical evaluation, including results on a separate, secret test set (Jean-Paul will perform the final tests of your classifier).
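For the empirical evaluation, a group might report word accuracy and character error rate on its own held-out data. The sketch below is one possible way to compute both in Python; the course material does not prescribe these metrics, and the function names edit_distance and evaluate, as well as the (ground truth, recognized) pair format, are assumptions made for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def evaluate(pairs):
    """Compute word accuracy and character error rate.

    pairs: iterable of (ground_truth, recognized) strings (assumed format).
    """
    pairs = list(pairs)
    total_chars = sum(len(truth) for truth, _ in pairs)
    char_errors = sum(edit_distance(truth, hyp) for truth, hyp in pairs)
    correct_words = sum(1 for truth, hyp in pairs if truth == hyp)
    return {
        "word_accuracy": correct_words / len(pairs) if pairs else 0.0,
        "cer": char_errors / total_chars if total_chars else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical recognizer output for two words.
    print(evaluate([("Koning", "Koning"), ("Kabinet", "Kabinel")]))
```

Reporting both numbers is useful because a recognizer built from character-level chunks can get most characters right while still missing many whole words.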
At each lecture, starting from the second, every group is expected to give a progress report. The report should have at least the following components:
Each component needs to be properly documented and supported by references or by tables and graphs. Show that you have read the articles (i.e., show what each article was about and what its conclusions were), and keep a list of them; you will need it for the References section of your paper.
Each group member should have the opportunity to show their presentation skills: divide all tasks, including the presentations, equally among the group members. Appoint one person to maintain the progress update PPT slides, one to oversee the overall system architecture, one to design the empirical evaluation (test scripts), etc.
At the end of the course, you are expected to have written a handwriting recognition system. To test your recognizer, Jean-Paul will (compile and) run your code on a separate, secret test set. See the technical details on how to hand in your program, how to handle arguments, etc.
You will be graded on your participation in the group, on your presentation, empirical evaluation and programming, and on your report. The report is written individually, but parts of it can be drafted by the group as a whole. Note that this means your final, personal report needs to be substantially different from the other reports of your group, especially in the Introduction and Discussion sections.
The final grade appears on Progress. There is no exam other than the final recognizer and written report.
The report needs to be a scientific paper about the handwriting recognition system that you built during the course.
Your report will be graded on the following subjects: title + abstract, introduction, method, formalisms, analysis, results + interpretation, references, engineering, science.
Direct your questions to Jean-Paul van Oosten.
Last modified: July 01, 2014, by Jean-Paul van Oosten
Part of the HWR course