Part (a) of a tutorial presented at ICDAR'99, Bangalore, Sept. 19.





Developments in handwritten input

Lambert Schomaker
NICI / Nijmegen University
The Netherlands
hwr.nici.kun.nl


User & System



  • (I) HCI attacks PR: the unistroke concept

  • (II) multiple agents as an architecture to realize user-system collaboration

  • (III) the Scryption museum demo:
    "A multiple-agent recognizer of on-line handwriting" or "How to amuse bored museum visitors"


Table 1: A taxonomy of pen-based input


1 Textual Data Input
  1.1 Conversion to ASCII
    1.1.1 Free Text Entry
      1.1.1.1 Fully unconstrained (size, orientation, styles) (e.g. PostIts)
      1.1.1.2 Lineated form, no prompting, free order of actions
      1.1.1.3 Prompted
        1.1.1.3.1 Acknowledge by "OK" dialog box
        1.1.1.3.2 Acknowledge by time-out (e.g. 800 ms)
        1.1.1.3.3 Acknowledge by gesture (see 2.3)
    1.1.2 Boxed Forms
    1.1.3 Virtual keyboard
  1.2 Graphical text storage and communication (plain handwritten ink)

2 Command Entry
  2.1 Widget selection
  2.2 Drag-and-drop operations
  2.3 Pen gestures
    2.3.1 Position-independent gestures
    2.3.2 Position-dependent context gestures
  2.4 Continuous control (e.g. sliders, ink thickness by pressure)

3 Graphical Pattern Input
  3.1 Free-style drawings
  3.2 Flow charts and schematics
  3.3 Mathematical symbols
  3.4 Music scores

4 Signature verification





Modes in red do not require pattern recognition



(a.I) Developments in handwritten input


Where's the bandwidth?



figures/my-penfield.gif
Penfield and Rasmussen (1950)

  • 'more neurons' means higher resolution and less noise

  • hand, mouth and tongue are over-represented

  • fine motor control with the hand: pen!

  • speech, too!




HCI attacks PR: the unistroke concept



  • Goldberg & Richardson (1993):
    "If the pattern recognition is so difficult, simplify the alphabet, because users have a better neural network anyway" (rephrased)

  • Received a lot of scientific criticism from both HCI and PR!

  • But became a huge success
    (Graffiti, PalmPilot etc.)

  • General result: increased interest in the user of pen-computing technology


 






The G&R unistroke concept has four aspects!



  • A symbol is a single pen-down stroke

  • easy for a classifier to discriminate

  • easy for the user to learn and memorize

  • input and output location on screen are decoupled

==> Is this still 'handwriting'?


 






G&R user interface: Spatial decoupling of Input/Output



figures/goldberg-decoupling.gif
Figure 1: Separating pen and work space (1) gives a better view of the screen, (2) allows for cheaper hardware, and (3) is consistent with keyboard experience: no paper metaphor.

  • Advantage: early input validation!
  • Disadvantage: a discontinuous form of writing.
  • At the other extreme is: "natural handwriting & lazy recognition" (Nakagawa, 1993)


 




figures/goldberg.gif

Figure 2: The Goldberg-Richardson alphabet. Note unistrokes and distinctive shapes.

 

figures/greek-unistrokes.gif

Figure 3: Experiments with unistroke characters and new gesture shapes. Note that for the user, Greek lower-case characters are easy to produce and remember, and relatively easy on the classifier (project MIAMI).




Simple gesture classifier / Kohonen LVQ type



A unistroke gesture classifier that needs only 5-10 examples from a new user already works quite well:

  • Feature vector: 30 spatially resampled and size-normalized coordinates (x,y) plus 29 running-angle values (cos(φ),sin(φ)) (total 118 feature values)

  • Learning a prototype $\vec{p}$ from a new sample $\vec{f}$ at learning event $k = 0,1,\ldots$:

    $\vec{p}_k = \alpha \vec{f}_k + (1 - \alpha) \vec{p}_{k-1}$

    The learning rate is $\alpha \in [0,1]$. For $k = 0$, $\alpha = 1$; otherwise $\alpha < 1$, e.g., $\alpha = 0.2$.

  • Matching: 1-NN, Euclidean distance
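
The three steps on this slide (feature vector, prototype learning, 1-NN matching) can be sketched in a few lines of Python. This is a minimal illustration and not the original implementation: the function names, the centering on the centroid, the scaling by the larger bounding-box side, and the use of a single prototype per class are all assumptions made here.

```python
import math

N_POINTS = 30  # resampled points per stroke, as on the slide


def resample(stroke, n=N_POINTS):
    """Resample a polyline to n points spaced equally along its arc length."""
    cum = [0.0]  # cumulative arc length at each input point
    for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0.0:  # degenerate stroke: a single dot
        return [stroke[0]] * n
    out, j = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        while j < len(cum) - 2 and cum[j + 1] < target:
            j += 1
        seg = cum[j + 1] - cum[j]
        t = 0.0 if seg == 0.0 else (target - cum[j]) / seg
        (x0, y0), (x1, y1) = stroke[j], stroke[j + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def features(stroke):
    """30 normalized (x,y) pairs plus 29 (cos, sin) pairs: 118 values."""
    pts = resample(stroke)
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    cx, cy = sum(xs) / len(xs), sum(ys) / len(ys)  # assumption: center on centroid
    scale = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    norm = [((x - cx) / scale, (y - cy) / scale) for x, y in pts]
    vec = [c for p in norm for c in p]
    for (x0, y0), (x1, y1) in zip(norm, norm[1:]):
        d = math.hypot(x1 - x0, y1 - y0)
        vec += [(x1 - x0) / d, (y1 - y0) / d] if d > 0 else [0.0, 0.0]
    return vec


def learn(prototypes, label, f, alpha=0.2):
    """p_k = alpha * f_k + (1 - alpha) * p_{k-1}; the first sample is copied (alpha = 1)."""
    p = prototypes.get(label)
    if p is None:
        prototypes[label] = list(f)
    else:
        prototypes[label] = [alpha * fi + (1 - alpha) * pi for fi, pi in zip(f, p)]


def classify(prototypes, f):
    """1-NN matching with (squared) Euclidean distance."""
    def dist2(p):
        return sum((a - b) ** 2 for a, b in zip(p, f))
    return min(prototypes, key=lambda label: dist2(prototypes[label]))


# Usage: teach two toy gestures, then classify a slightly tilted horizontal stroke.
prototypes = {}
learn(prototypes, "h", features([(0, 0), (10, 0)]))
learn(prototypes, "v", features([(0, 0), (0, 10)]))
guess = classify(prototypes, features([(1, 1), (21, 2)]))
```

Because each class keeps only one running-average prototype, a new user can retrain a symbol with a handful of samples; the learning rate decides how quickly older samples are forgotten.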


 




(a.II) Developments in handwritten input


Multiple agents and a user interface concept



  • Problem background: How to integrate bottom-up classification with top-down expectancies (Schomaker, Hoenkamp & Mayberry, 1998).

  • Architecture: A triple-agent system: User, Classifier, Parser

  • User interface concept:
    "Don't interfere with the ink unless something is wrong!!"

  • Application: "Write" programs in the Scheme programming language


 






Multiple agents and a user interface concept



figures/triple-agent.gif
Figure 4: Three agents


 






Multiple agents and a user interface concept



figures/main_window.gif
Figure 5: Sample screen. After classification, no machine-font text is displayed; instead, the ink is left untouched. If a classification is uncertain, coloring is used. The last token is not yet in the symbol list and is therefore colored red. The user, the shape-classifier agent and the incremental-parser agent collaborate to yield "100% recognition". In the example, the written token "base" is as yet unknown, but the parser expects a VAR, as indicated by the red button in the menu below. Clicking this button adds the new shape to the token shape table.
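
The collaboration described in the caption can be sketched as a small decision rule. This is only an illustration under stated assumptions: the function names, the action names, the 0.8 acceptance threshold and the "KEYWORD" token class are hypothetical and do not come from the original system.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical threshold: below this confidence the ink is colored as uncertain.
ACCEPT_AT = 0.8


def ui_action(label: Optional[str],
              confidence: float,
              token_class: Optional[str],
              expected: Set[str]) -> Tuple[str, List[str]]:
    """Decide what the interface does with one freshly written token.

    label        best match from the shape classifier, None if no known shape fits
    confidence   classifier confidence in [0, 1]
    token_class  grammatical class of `label` (e.g. "VAR"), None if unknown
    expected     token classes the incremental parser allows at this position
    """
    if label is None:
        # Unknown shape: offer the parser's expectations as buttons so the
        # user can add the shape to the token shape table.
        return ("offer_new_token", sorted(expected))
    if token_class in expected and confidence >= ACCEPT_AT:
        # Classifier and parser agree: do not interfere with the ink.
        return ("leave_ink", [])
    # Known shape, but low confidence or a conflict with the parser.
    return ("color_ink", sorted(expected))


def add_token(shape_table: Dict[str, List[float]], name: str,
              token_features: List[float]) -> None:
    """The user confirmed a new token: store its shape for later matching."""
    shape_table[name] = list(token_features)


# Usage: the token "base" is unknown, the parser expects a VAR, the user clicks.
shape_table: Dict[str, List[float]] = {}
action, buttons = ui_action(None, 0.0, None, {"VAR"})
if action == "offer_new_token" and "VAR" in buttons:
    add_token(shape_table, "base", [0.1, 0.2])
```

The point of the rule is that the interface stays silent in the common case and only asks the user for help when the two automatic agents cannot settle the token themselves.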


 




(a.III) Developments in handwritten input


Museum demo: multiple-agent HWR



  • Goal: Build a system for an exposition that shows the general public (including children) how on-line recognition works.

  • User requirements for a museum demo: robust system, robust interface, multiple handwriting styles, fun to use, informative



  • Implementation: Linux, two PCs, ethernet:

    • (1) user interface: Wacom Tablet + Apple color LCD flatscreen

    • (2) HWR engine: monitor shows system's interpretation of handwritten input


 






Museum demo: multiple-agent HWR



Problem: how to explain the idea of agent negotiation?

  • Give the agents a face → 'smiley'

  • Distinctive colors for the agent/smileys

  • The degree of classification confidence determines the shape of the smile

  • (that should do the job, right?)
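
One way to let confidence determine the shape of the smile is to map it linearly onto the curvature of a parabolic mouth. The sketch below assumes a confidence value in [0, 1] and screen coordinates with y pointing down; the function names, the linear mapping and the default sizes are illustrative assumptions, not details of the museum demo.

```python
def smile_curvature(confidence):
    """Map confidence in [0, 1] to curvature in [-1, 1]: -1 frown, 0 flat, +1 smile."""
    c = min(1.0, max(0.0, confidence))  # clamp out-of-range values
    return 2.0 * c - 1.0


def mouth_points(confidence, n=9, width=1.0, depth=0.3):
    """Return n (x, y) points of a parabolic mouth centered on x = 0.

    Screen convention (y grows downward): for a smile the middle of the
    mouth lies below the corners, so its y value is larger.
    """
    k = smile_curvature(confidence)
    points = []
    for i in range(n):
        t = -1.0 + 2.0 * i / (n - 1)   # runs from -1 (left corner) to +1 (right corner)
        x = t * width / 2.0
        y = k * depth * (1.0 - t * t)  # corners stay at y = 0
        points.append((x, y))
    return points


# Usage: a confident agent smiles, an unsure one frowns.
happy = mouth_points(0.95)
unsure = mouth_points(0.10)
```

A continuous mapping like this lets visitors see small differences in confidence between agents, which a two-state happy/sad face would hide.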


 






Museum demo: multiple-agent HWR



More problems...

  • The user's sense of purpose:
    "Write something!"
    "Why?"
    "What?"

  • ...the User Goal is a central concept in HCI and is the basis for formal modeling and prediction techniques.

  • Also: What is an appropriate lexicon type and size?

  • ==> Write a Dutch city name. If a heraldic sign exists for that city, a .gif image of it is shown to the user as a reward



e.g., Amsterdam is: figures/amsterdam.gif


 






A multiple-agent recognizer of on-line handwriting: the agents



agent/smiley  Class             Method                          Features
1 green       char              1-NN, wEuclid, flat list        (x,y) ∪ (cos(φ),sin(φ))
2 pink        char              1-NN, wEuclid, flat list        I(x,y) 16x16
3 blue        char              hierarchical Kohonen map        (x,y) ∪ (cos(φ),sin(φ))
4 violet      char              1-NN wEuclid                    geometrical character features
5 cyan        char              rule based                      structural character features
6 brown       char              rule based                      confusion rules & struct. features
7 red         cursive cap.      flat list based on hier. clust  (x,y)...
8 yellow      strokes           (explicit) Markov               14-dim stroke feature vector
9 grey        block print cap.  MLP                             I(x,y) 16x16


figures/screendump-scryption.gif
(a) Developments in handwritten input


Hardware: writing desk

figures/writing-table.gif



  • .... not that this application is so common ....
  • but the relevant UI questions are generic:

    • what are the actual user goals?

    • what internal aspects of a recognizer must be visualized?

    • how to visualize numerical information (such as confidence values)?

 




Conclusion





  • Too often, on-line HWR systems are detached from reality.

  • For example: does a user want to write a letter, or concentrate on the tedious submission of isolated words to an uncooperative recognizer?

  • Thinking about the user and the context of usage leads to new ways of integrating pattern classifiers within an application.

  • For some applications, grossly simplified unistroke classifiers already suffice.

  • For other applications (e.g., mobile faxing of notes) pattern recognition is not needed at all...

 




References

Goldberg, D. & Richardson, C. (1993).
        Touch-Typing with a Stylus. In: INTERCHI 1993, Bridges Between
        Worlds, 1993, 24-29 April (pp. 80-87).


Nakagawa, M., Machii, K., Kato, N. and Souya, T. (1993).
        Lazy Recognition as a Principle of Pen Interfaces,
        INTERCHI'93 Adjunct Proc. (pp. 89-90).

Penfield, W., and Rasmussen, T. (1950). 
        The cerebral cortex in man. N. Y.: Macmillan.

Plamondon, R., Lopresti, D.P., Schomaker, L.R.B. and Srihari, R. (1999). 
        On-line handwriting recognition. 
        Wiley Encyclopedia of Electrical & Electronics Engineering, 
        xx(x), xxx-xxx. 

Schomaker, L.R.B. (1998). 
        From handwriting analysis to pen-computer applications. 
        IEE Electronics Communication Engineering Journal, 10(3), pp. 93-102.

Schomaker, L., Hoenkamp, E. & Mayberry, M. (1998). 
        Towards collaborative agents for automatic on-line 
        handwriting recognition. Proceedings of the Third European 
        Workshop on Handwriting Analysis and Recognition, 14-15 July, 1998, 
        London: IEE Digest Number 1998/440, (ISSN 0963-3308), pp. 13/1-13/6.

Tutorial "new pen-based applications" ICDAR'99 Bangalore. Copyright 1999 L. Schomaker