Part (a) of tutorial, presented at ICDAR'99, Bangalore, Sept. 19.

Developments in handwritten input
Lambert Schomaker
NICI / Nijmegen University
The Netherlands
hwr.nici.kun.nl

User & System

(I) HCI attacks PR: the unistroke concept

(II) multiple agents as an architecture to realize user-system collaboration

(III) the Scryption museum demo:
"A multiple-agent recognizer of on-line handwriting" or "How to amuse bored museum visitors"

Table 1: A taxonomy of pen-based input

1 Textual Data Input
  1.1 Conversion to ASCII
    1.1.1 Free Text Entry
      1.1.1.1 Fully unconstrained (size, orientation, styles) (e.g. PostIts)
      1.1.1.2 Lineated form, no prompting, free order of actions
      1.1.1.3 Prompted
        1.1.1.3.1 Acknowledge by "OK" dialog box
        1.1.1.3.2 Acknowledge by Time-out (e.g. 800 ms)
        1.1.1.3.3 Acknowledge by Gesture (see 2.3)
    1.1.2 Boxed Forms
    1.1.3 Virtual keyboard
  1.2 Graphical text storage and communication (plain handwritten ink)

2 Command Entry
  2.1 Widget selection
  2.2 Drag-and-drop operations
  2.3 Pen gestures
    2.3.1 Position-independent gestures
    2.3.2 Position-dependent context gestures
  2.4 Continuous control (e.g. sliders, ink thickness by pressure)

3 Graphical Pattern Input
  3.1 Free-style drawings
  3.2 Flow charts and schematics
  3.3 Mathematical symbols
  3.4 Music scores

4 Signature verification

Modes in red do not require pattern recognition

(a.I) Developments in handwritten input
Where's the bandwidth?

Penfield and Rasmussen (1950)

'more neurons' means higher resolution and less noise

hand, mouth and tongue are over-represented

fine motor control with the hand: pen!

speech, too!

HCI attacks PR: the unistroke concept

Goldberg & Richardson (1993):
"If the pattern recognition is so difficult, simplify the alphabet, because users have a better neural network anyway" (rephrased)

Received a lot of scientific criticism from both HCI and PR!

But became a huge success
(Graffiti, PalmPilot etc.)

General result: Increased interest in the user of the pen-computing technology

The G&R unistroke concept has four aspects!

A symbol is a single pen-down stroke

easy for a classifier to discriminate

easy for the user to learn and memorize

input and output location on screen are decoupled

==> Is this still 'handwriting'?

G&R user interface: Spatial decoupling of Input/Output

Figure 1: Separating pen and work space (1) gives a better view of the screen, (2) allows for cheaper hardware, and (3) is consistent with keyboard experience: no paper metaphor.

Advantage: early input validation!
Disadvantage: a discontinuous form of writing.
At the other extreme is: "natural handwriting & lazy recognition" (Nakagawa, 1993)

Figure 2: The Goldberg-Richardson alphabet. Note unistrokes and distinctive shapes.

Figure 3: Experiments with unistroke characters and new gesture shapes. Note that for the user, Greek lower-case characters are easy to produce and remember, and relatively easy on the classifier (project MIAMI).

Simple gesture classifier / Kohonen LVQ type

A unistroke gesture classifier that needs only 5-10 examples from a new user already works quite well:

Feature vector: 30 spatially resampled and size-normalized coordinates (x,y) plus 29 running-angle values (cos φ, sin φ), for a total of 2·30 + 2·29 = 118 feature values
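The feature vector described above can be sketched as follows. This is only an illustration: the slide does not say how resampling or size normalization were done, so resampling at equal path-length intervals and scaling by the larger bounding-box side around the centroid are assumptions, and all function names are hypothetical.

```python
import math

N_POINTS = 30  # number of resampled points, as stated on the slide


def resample(points, n=N_POINTS):
    """Resample a stroke to n points equally spaced along its path (assumed scheme)."""
    cum = [0.0]  # cumulative path length at each input point
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0.0:  # degenerate stroke (a dot)
        return [tuple(points[0])] * n
    out, j = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        # advance to the input segment that contains the target path length
        while j < len(cum) - 2 and cum[j + 1] < target:
            j += 1
        seg = cum[j + 1] - cum[j]
        t = 0.0 if seg == 0.0 else (target - cum[j]) / seg
        (x0, y0), (x1, y1) = points[j], points[j + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def normalize(points):
    """Center on the centroid and scale by the larger bounding-box side (assumed scheme)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    cx, cy = sum(xs) / len(xs), sum(ys) / len(ys)
    size = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    return [((x - cx) / size, (y - cy) / size) for x, y in points]


def running_angles(points):
    """(cos phi, sin phi) of the direction between consecutive points."""
    feats = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        d = math.hypot(dx, dy)
        feats.append((dx / d, dy / d) if d > 0.0 else (0.0, 0.0))
    return feats


def unistroke_features(stroke):
    """Return 60 coordinate values followed by 58 running-angle values."""
    pts = normalize(resample(stroke))
    flat = [v for p in pts for v in p]
    flat += [v for a in running_angles(pts) for v in a]
    return flat
```

Calling `unistroke_features([(0, 0), (10, 0), (10, 10)])` returns a list of 118 values, matching the count given on the slide.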

Learning a prototype $\vec{p}$ from a new sample $\vec{f}$ at learning event $k = 0, 1, \ldots$:

$$\vec{p}_k = \alpha\,\vec{f}_k + (1 - \alpha)\,\vec{p}_{k-1}$$

The learning rate is $\alpha \in [0,1]$. For $k = 0$, $\alpha = 1$; otherwise $\alpha < 1$, e.g., $\alpha = 0.2$.
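A minimal sketch of this prototype update, assuming feature vectors are plain lists of floats. The function name and the use of `None` to mark "no prototype yet" (learning event k = 0) are illustrative choices, not taken from the slides.

```python
def update_prototype(prototype, sample, alpha=0.2):
    """Blend a new sample into a prototype: p_k = alpha * f_k + (1 - alpha) * p_{k-1}."""
    if prototype is None:
        # First learning event (k = 0): alpha = 1, so the prototype is the sample itself.
        return list(sample)
    return [alpha * f + (1.0 - alpha) * p for f, p in zip(sample, prototype)]


# Usage: start without a prototype, then refine it with a second sample.
p = update_prototype(None, [1.0, 2.0])
p = update_prototype(p, [2.0, 4.0], alpha=0.2)
```

With a small learning rate such as 0.2 the prototype moves only a fifth of the way toward each new sample, so a single sloppy example does not overwrite what was learned before.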

Matching: 1-NN, Euclidean distance
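The matching step can be sketched as a nearest-neighbour search over stored prototypes. The data layout (a dictionary from gesture label to a list of prototype vectors) and the function name are assumptions made for this example.

```python
import math


def classify_1nn(features, prototypes):
    """Return (label, distance) of the prototype closest to `features`.

    `prototypes` maps each gesture label to a list of prototype vectors
    (hypothetical layout); the distance is plain Euclidean.
    """
    best_label, best_dist = None, math.inf
    for label, vectors in prototypes.items():
        for proto in vectors:
            d = math.dist(features, proto)
            if d < best_dist:
                best_label, best_dist = label, d
    return best_label, best_dist


# Usage with two toy 2-dimensional prototypes.
label, dist = classify_1nn([0.9, 0.8], {"a": [[0.0, 0.0]], "b": [[1.0, 1.0]]})
```

Returning the distance along with the label lets the caller treat a large distance as low confidence.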

(a.II) Developments in handwritten input
Multiple agents and a user interface concept

Problem background: How to integrate bottom-up classification with top-down expectancies (Schomaker, Hoenkamp & Mayberry, 1998).

Architecture: A triple-agent system, User, Classifier, Parser

User interface concept:
"Don't interfere with the ink unless something is wrong!!"

Application: "Write" programs in the Scheme programming language

Multiple agents and a user interface concept

Figure 4: Three agents

Multiple agents and a user interface concept

Figure 5: Sample screen. After classification, no machine-font text is displayed; instead, the ink is left untouched. Uncertain classifications are indicated by coloring the ink. The last token is not yet in the symbol list and is therefore colored red. The user, the shape-classifier agent and the incremental-parser agent collaborate to yield "100% recognition". In the example, the written token "base" is as yet unknown, but the parser expects a VAR, as indicated by the red button in the menu below. Clicking this button adds the new shape to the token shape table.
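The interaction in this figure can be illustrated with a toy sketch. It is not the system's implementation: the distance threshold, the dictionary-based token shape table, the return format and all names are hypothetical, and the parser is reduced to a category string (such as "VAR") that it is assumed to supply.

```python
import math

UNKNOWN_THRESHOLD = 0.5  # hypothetical distance above which a shape counts as unknown


def classify(shape, shape_table):
    """Shape-classifier agent: nearest stored token shape and its distance."""
    best_token, best_dist = None, math.inf
    for token, proto in shape_table.items():
        d = math.dist(shape, proto)
        if d < best_dist:
            best_token, best_dist = token, d
    return best_token, best_dist


def process_token(shape, shape_table, expected_category):
    """Combine classifier output with the parser's expectation."""
    token, dist = classify(shape, shape_table)
    if dist <= UNKNOWN_THRESHOLD:
        # Recognized: leave the ink untouched.
        return {"token": token, "ink_color": "black", "offer": None}
    # Unknown shape: color the ink red and offer the category the parser expects.
    return {"token": None, "ink_color": "red", "offer": expected_category}


def user_accepts(shape, name, shape_table):
    """User agent: clicking the offered button stores the new token shape."""
    shape_table[name] = list(shape)


# Usage: an unknown token is flagged, accepted by the user, then recognized.
table = {"define": [0.0, 0.0]}
first = process_token([3.0, 4.0], table, "VAR")
user_accepts([3.0, 4.0], "base", table)
second = process_token([3.0, 4.0], table, "VAR")
```

The point of the sketch is the division of labor: the classifier only measures shape similarity, the parser only says what kind of token may come next, and the user is asked to act only when the two cannot settle the matter.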

(a.III) Developments in handwritten input
Museum demo: multiple-agent HWR

Goal: Build a system for an exposition that shows the general public (including children) how on-line recognition works.

User requirements for a museum demo: robust system, robust interface, multiple handwriting styles, fun to use, informative

Implementation: Linux, two PCs, ethernet:

(1) user interface: Wacom Tablet + Apple color LCD flatscreen

(2) HWR engine: monitor shows system's interpretation of handwritten input

Museum demo: multiple-agent HWR

Problem: how to explain the idea of agent negotiation?

Give the agents a face → 'smiley'

Distinctive colors for the agent/smileys

The degree of classification confidence determines the shape of the smile

(that should do the job, right?)
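One way to picture the confidence-to-smile idea is a mouth drawn as a parabola whose curvature follows the confidence value. The linear mapping, the parabola and both function names are assumptions for illustration; the slides do not describe how the smileys were actually drawn.

```python
def smile_curvature(confidence):
    """Map a confidence in [0, 1] to a curvature in [-1, 1] (assumed linear mapping)."""
    c = min(1.0, max(0.0, confidence))
    return 2.0 * c - 1.0


def mouth_points(confidence, width=1.0, n=9):
    """Points of a parabolic mouth; the y axis points up, so raised corners mean a smile."""
    k = smile_curvature(confidence)
    pts = []
    for i in range(n):
        x = -width / 2.0 + width * i / (n - 1)
        y = k * (2.0 * x / width) ** 2 * (width / 4.0)
        pts.append((x, y))
    return pts


# Usage: a confident agent smiles, an unsure one frowns, 0.5 gives a flat mouth.
happy = mouth_points(1.0)
neutral = mouth_points(0.5)
sad = mouth_points(0.0)
```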

Museum demo: multiple-agent HWR

More problems...

The user's sense of purpose:
"Write something!"
"Why?"
"What?"

...the User Goal is a central concept in HCI and is the basis for formal modeling and prediction techniques.

Also: What is an appropriate lexicon type and size?

==> write a Dutch city name. If a heraldic sign exists for that city, a .gif image of it is shown to the user as a reward

e.g., for Amsterdam: [image: heraldic sign of the city]

A multiple-agent recognizer of on-line handwriting: the agents

agent/smiley  Class             Method                          Features

1 green       char              1-NN, wEuclid, flat list        (x,y) ∪ (cos φ, sin φ)
2 pink        char              1-NN, wEuclid, flat list        I(x,y) 16x16
3 blue        char              hierarchical Kohonen map        (x,y) ∪ (cos φ, sin φ)
4 violet      char              1-NN, wEuclid                   geometrical character features
5 cyan        char              rule based                      structural character features
6 brown       char              rule based                      confusion rules & struct. features
7 red         cursive cap.      flat list based on hier. clust  (x,y)...
8 yellow      strokes           (explicit) Markov               14-dim stroke feature vector
9 grey        block print cap.  MLP                             I(x,y) 16x16

(a) Developments in handwritten input
Hardware: writing desk

.... not that this application is so common ....
but the relevant UI questions are generic:

what are the actual user goals?

what internal aspects of a recognizer must be visualized?

how to visualize numerical information (such as confidence values)?

Conclusion

Too often, on-line HWR systems are detached from reality.

For example: does a user want to write a letter or concentrate on the tedious submission of isolated words to an uncooperative recognizer?

Thinking about the user and the context of usage leads to new ways of integrating pattern classifiers within an application.

For some applications, grossly simplified unistroke classifiers already suffice.

For other applications (e.g., mobile faxing of notes) pattern recognition is not needed at all...

References

Goldberg, D. & Richardson, C. (1993). Touch-Typing with a Stylus. In: INTERCHI 1993, Bridges Between Worlds, 24-29 April (pp. 80-87).

Nakagawa, M., Machii, K., Kato, N. & Souya, T. (1993). Lazy Recognition as a Principle of Pen Interfaces. INTERCHI'93 Adjunct Proc. (pp. 89-90).

Penfield, W. & Rasmussen, T. (1950). The cerebral cortex in man. N.Y.: Macmillan.

Plamondon, R., Lopresti, D.P., Schomaker, L.R.B. & Srihari, R. (1999). On-line handwriting recognition. Wiley Encyclopedia of Electrical & Electronics Engineering, xx(x), xxx-xxx.

Schomaker, L.R.B. (1998). From handwriting analysis to pen-computer applications. IEE Electronics Communication Engineering Journal, 10(3), pp. 93-102.

Schomaker, L., Hoenkamp, E. & Mayberry, M. (1998). Towards collaborative agents for automatic on-line handwriting recognition. Proceedings of the Third European Workshop on Handwriting Analysis and Recognition, 14-15 July 1998, London: IEE Digest Number 1998/440 (ISSN 0963-3308), pp. 13/1-13/6.

Tutorial "new pen-based applications" ICDAR'99 Bangalore. Copyright 1999 L. Schomaker
