A Kohonen self-organized map of velocity-based strokes
Given a segmentation of on-line handwriting into velocity-based strokes
(VBSs), a feature vector can be determined for each stroke.
The VBSs of handwriting samples from a large number
of writers are presented to a Kohonen
self-organizing map (SOM). This method is essentially a form of dimensionality
reduction and feature vector quantization. The network consists of a number
of 'cells', organized as a string (1D), sheet (2D), cube (3D), etc. Training
starts with random values in the feature vector belonging to
each cell. Here, we use a 2D network of 20x20 cells. From the huge training set
of VBSs (typically over 50k strokes), a single stroke is selected at random
and presented to the network. The cell whose feature vector is nearest to the
stroke is found, and
a region of cells around it (a 'bubble') in the network is made a little more similar
to the input stroke. Several learning rules are possible for this update.
Then the next sample stroke is drawn from the training set, and so forth.
The essential trick of the Kohonen SOM is that early in training a large
region of cells is updated, whereas at the end of training only the vector
of the best-fitting cell is updated.
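
The training loop described above can be sketched as follows. This is only
an illustrative sketch, not the code used for the maps shown here: the
function names, the linear decay of bubble radius and learning rate, and the
default parameter values are all assumptions, and the simple "move towards
the input" update is just one of the several possible learning rules.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def best_matching_cell(grid, x):
    """Return (row, col) of the cell whose vector is nearest to x."""
    cells = ((r, c) for r in range(len(grid)) for c in range(len(grid[0])))
    return min(cells, key=lambda rc: sq_dist(grid[rc[0]][rc[1]], x))


def train_som(strokes, rows=20, cols=20, steps=10000,
              radius0=10.0, rate0=0.5, seed=0):
    """Train a 2D map on stroke feature vectors (lists of floats)."""
    rng = random.Random(seed)
    dim = len(strokes[0])
    # Every cell starts with a random feature vector.
    grid = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
            for _ in range(rows)]
    for t in range(steps):
        x = rng.choice(strokes)              # one stroke drawn at random
        br, bc = best_matching_cell(grid, x)
        decay = 1.0 - t / steps              # assumed linear schedule
        radius = radius0 * decay             # bubble shrinks during training
        rate = rate0 * decay                 # so does the step size
        for r in range(rows):
            for c in range(cols):
                if (r - br) ** 2 + (c - bc) ** 2 <= radius ** 2:
                    cell = grid[r][c]
                    for i in range(dim):
                        # Move the cell a little towards the input stroke.
                        cell[i] += rate * (x[i] - cell[i])
    return grid
```

Once the radius has dropped below one cell spacing, the bubble contains only
the best-matching cell, which gives the end-of-training behaviour described
above. Feature values are assumed to be scaled to roughly the 0..1 range so
that the random initial vectors are comparable to the data.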
Notes:
- Stroke feature vector: (1) vertical start level, (2) vertical stop level,
(3-7) five consecutive angles along the trace of this stroke, (8,9) last two
angles of the previous stroke in the word, (10,11) first two angles of the
next stroke in the word, (12) loop area if this stroke loops with the next,
(13) pen-up/down flag, (14) total length of the stroke. Several other feature
schemes have been studied. The angular information seems to be more stable
than absolute or relative size-based features (Teulings & Schomaker, 1993).
- The idea of using handwriting strokes instead of the speech feature
vectors that Kohonen himself used in the "Neural Typewriter" was put
forward by Piero Morasso (University of Genoa, DIST) while we collaborated
in the Papyrus Esprit project. The DIST group, however, uses different features
and has another approach to the overall recognition architecture.
- Up strokes are colored red, down strokes are colored blue, air strokes
are colored green.
No provision is made for almost-horizontal strokes in this coding scheme,
so they may be red or blue, depending on their incidental start and end
levels.
- A little black dot is shown at the starting point of a stroke.
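
As an illustration of the notes above, the 14-element feature vector and the
colour coding could be represented as follows. This is a hypothetical
sketch: the class and field names, the 1.0/0.0 encoding of the pen flag, the
assumption that a larger vertical level means higher on the page, and the
tie-breaking for exactly level strokes are not taken from the original
implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StrokeFeatures:
    y_start: float            # (1) vertical start level
    y_stop: float             # (2) vertical stop level
    angles: List[float]       # (3-7) five angles along this stroke
    prev_angles: List[float]  # (8,9) last two angles of the previous stroke
    next_angles: List[float]  # (10,11) first two angles of the next stroke
    loop_area: float          # (12) loop area with the next stroke
    pen_down: bool            # (13) pen-up/down flag
    length: float             # (14) total stroke length

    def to_vector(self) -> List[float]:
        """Flatten the features into the 14-element vector for the map."""
        counts = (len(self.angles), len(self.prev_angles),
                  len(self.next_angles))
        if counts != (5, 2, 2):
            raise ValueError("expected 5 + 2 + 2 angles")
        return [self.y_start, self.y_stop, *self.angles,
                *self.prev_angles, *self.next_angles, self.loop_area,
                1.0 if self.pen_down else 0.0,   # assumed flag encoding
                self.length]


def stroke_color(s: StrokeFeatures) -> str:
    """Display colour: red = up stroke, blue = down stroke, green = air."""
    if not s.pen_down:
        return "green"
    # Assumes larger y means higher on the page; level strokes fall to blue.
    return "red" if s.y_stop > s.y_start else "blue"
```

Because the colour depends only on the start and stop levels, an
almost-horizontal stroke ends up red or blue according to a small level
difference, which is the behaviour noted above.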
There is a slide show of the training process of one of our VBS-based
Kohonen self-organized maps, produced by Louis Vuurpijl.
Please refer to our Publications when using anything from the shown material.
Copyright Lambert Schomaker (April 1, 1996)