In bimodal handwriting & speech control, the user combines the Human
Output Channels (HOCs) of speech and handwriting to achieve a specific
goal. A distinction must be made between textual input and command input
(see Appendix E). In textual input, the
goal is to enter linguistic data into a computer system, either in the form
of digitized signals, or as (ASCII)-coded strings. In command input, the
user selects a specific command to be executed and adds arguments and
qualifiers to it. The term handwriting in the title includes pen-gesture
control for the current purpose. Handwriting & speech bimodality in the
case of textual input means a potentially increased bandwidth and
reliability, provided that the user is able to deal with the combined
speech and pen control. Handwriting & speech bimodality in the case of
command input allows for a flexible choice of which modality carries the
command and which carries its arguments. As an example, the user may say
/erase/ and circle or tap an object with the pen (i.e. erase ``this'').
Alternatively, the user may
draw a deletion gesture and say the name of an object to be deleted.
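The two command-input variants above can be sketched in code. The following is a minimal illustration only, assuming that a speech recognizer and a pen-gesture recognizer have already produced discrete events; the names SpeechEvent, PenEvent, Command, COMMAND_WORDS and fuse are hypothetical and are not part of any system described in this report.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpeechEvent:
    word: str  # recognized spoken word: a command word or an object name


@dataclass
class PenEvent:
    kind: str                     # "tap", "circle" or "delete_gesture"
    target: Optional[str] = None  # object under the pen, if any


@dataclass
class Command:
    action: str
    target: str


# Hypothetical vocabulary of spoken command words.
COMMAND_WORDS = {"erase"}


def fuse(speech: SpeechEvent, pen: PenEvent) -> Optional[Command]:
    """Combine one speech event and one pen event into a command, if possible."""
    # Variant 1: speech carries the command, the pen points at the argument.
    if (speech.word in COMMAND_WORDS
            and pen.kind in ("tap", "circle")
            and pen.target is not None):
        return Command(speech.word, pen.target)
    # Variant 2: the pen gesture carries the command, speech names the argument.
    if pen.kind == "delete_gesture" and speech.word not in COMMAND_WORDS:
        return Command("erase", speech.word)
    # No consistent interpretation of the two events.
    return None
```

In this sketch, saying /erase/ while tapping an object and drawing a deletion gesture while naming the same object both yield the same Command, which is the flexibility referred to above. A real system would also have to handle timing, recognition uncertainty and ambiguous pairings, which are left out here.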
In the remainder of this section we will consider bimodality in speech and
handwriting from two viewpoints: (i) the automatic recognition and
artificial synthesis of these HOC data; and (ii) the mere storage and
replay of these HOC data. The emphasis will be on ``Control'', but we have
added some information on computer output media (COM), because the concepts
of recognition and synthesis are often confused. Furthermore, by speech we
mean the audio signal representing spoken text, and by ink we mean the
XY-trajectory representing written text.
Both signals are functions of time.
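As an illustration of the two signal definitions, the sketch below shows one possible in-memory representation. It assumes uniformly sampled audio and time-stamped pen coordinates; the class names Speech and Ink and their fields are hypothetical and are not a storage format prescribed by this report.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Speech:
    sample_rate_hz: float  # assumed uniform sampling rate
    samples: List[float]   # amplitude at time t = i / sample_rate_hz

    def duration_s(self) -> float:
        return len(self.samples) / self.sample_rate_hz


@dataclass
class Ink:
    # Each point is (t, x, y): time in seconds, pen position in device units.
    points: List[Tuple[float, float, float]]

    def duration_s(self) -> float:
        if not self.points:
            return 0.0
        return self.points[-1][0] - self.points[0][0]
```

Both classes keep the time axis explicit, so a stored signal can be replayed at its original rate or handed to a recognizer without losing timing information.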
Esprit Project 8579/MIAMI (Schomaker et al., '95)
Thu May 18 16:00:17 MET DST 1995