Bimodality in Handwriting-Visual Control can be interpreted in at least two ways:
Ad 1. One may consider the control of graphical objects on a
CRT screen by using a pen for discrete selection, i.e., by pointing and
tapping (cf. Appendix E.3.1).
The advantage of pen gestures is that no screen space is consumed by widgets such as tool bars and menus. Experienced CAD users already make use of so-called 'marking' menus, instead of continually referring to possibly remote widget objects on the screen. In the area of Personal Digital Assistants (PDAs), experiments are being done with an iconic type of pen gesture, in which users produce stylized scribbles that iconically depict larger graphical documents. In the user interface, a user-defined scribble then becomes the 'icon' by which the document may be referred to in later use. If the icon is on the screen, selection may be performed by tapping on it. However, if the icon is off the screen, the user may produce the (memorized) pen gesture to find the corresponding document. As a more specific example, in a multimedia encyclopedia, a list of pictures of castles may be produced by entering a stylized castle icon with the pen. Artists have shown interest in interfaces in which simple sketched compositions allow for the retrieval of paintings with a similar composition. Although such applications are far from possible with the current state of the art, it is one of the purposes of this research to explore the consequences of these ideas and to uncover the basic mechanisms involved.
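The scribble-as-icon mechanism described above can be sketched in code. The following is a minimal, hypothetical illustration (no specific recognizer is implied): each stored document is keyed by a user-defined pen trace, and a new gesture is matched against the stored traces by dynamic time warping (DTW) over the sampled (x, y) points. All names and the matching method are assumptions for the sketch.

```python
def dtw_distance(a, b):
    """DTW distance between two pen traces, each a list of (x, y) samples."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Euclidean cost between the two aligned sample points.
            cost = ((a[i - 1][0] - b[j - 1][0]) ** 2 +
                    (a[i - 1][1] - b[j - 1][1]) ** 2) ** 0.5
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

def retrieve(gesture, stored):
    """Return the name of the document whose key trace best matches the gesture."""
    return min(stored, key=lambda name: dtw_distance(gesture, stored[name]))
```

In practice the traces would be resampled and normalized before matching; the sketch only shows the basic retrieval loop.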
Ad 2. Handwriting as a human output channel (HOC), combined with
other forms of visible human behavior in control and manipulation.
Although not originally foreseen, possibilities are present in the area of
teleoperation and audio control, if one combines the two-dimensional pen
movement in time with another visible signal, such as the vertical distance
between upper and lower lip as observed by a camera pointed at
the user's face. Consider for instance a robotic arm, of which the
end-effector position in two dimensions is controlled by the pen on an
electronic paper device (see 3.2.3). The third
dimension may be controlled by the diameter of the user's face as observed
by the camera. For initial experiments, special coloring of the background
and the lips may be performed to allow for an easier processing of the
camera data.
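The bimodal control scheme just described can be made concrete with a small sketch: the pen supplies the (x, y) target of the end effector, while a camera-derived measure (here the apparent face diameter, in pixels) supplies the depth z after a simple linear calibration. The calibration constants and function names are illustrative assumptions, not values from the text.

```python
def face_diameter_to_z(diameter_px, near_px=200.0, far_px=100.0,
                       z_near=0.0, z_far=0.5):
    """Map the apparent face diameter to a depth in metres (linear model)."""
    # Clamp to the calibrated range so camera noise cannot push z out of bounds.
    d = max(min(diameter_px, near_px), far_px)
    frac = (near_px - d) / (near_px - far_px)
    return z_near + frac * (z_far - z_near)

def end_effector_target(pen_xy, face_diameter_px):
    """Combine pen position and camera measure into a 3-D target position."""
    x, y = pen_xy
    return (x, y, face_diameter_to_z(face_diameter_px))
```

A real system would low-pass filter both signals and calibrate the mapping per user; the sketch only shows how the two channels are fused.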
However, as stated earlier, the control of graphical parameters of objects visualized by the computer, by means of pen gestures, will be the main topic of this research. An interesting application concerns the use of a stylized face to represent the state of a handwriting-recognition agent. Most recognition systems deliver reliability estimates for the generated word or character class hypotheses. Usually, hypotheses with a probability below threshold are rejected. The facial expression of the miniature face representing the recognizer agent may smile in the case of neat handwriting input, and frown in the case of ``Rejects''. Research will have to decide whether this information is actually picked up by the user, or is considered irrelevant marginal graphics of an application.
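The "live icon" idea above amounts to a mapping from the recognizer's reliability estimate to an expression state. The following hedged illustration assumes a reject threshold and three expression levels; both are choices made for the sketch, not part of the text.

```python
REJECT_THRESHOLD = 0.4  # assumed: hypotheses below this probability are rejected

def agent_expression(best_prob):
    """Map the top hypothesis probability onto a facial expression."""
    if best_prob < REJECT_THRESHOLD:
        return "frown"    # input rejected ("Reject" case)
    if best_prob < 0.8:
        return "neutral"  # accepted, but with reservations
    return "smile"        # neat handwriting, confident recognition
```

Whether such an expression display is actually attended to by users is, as noted, an empirical question.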
Although there is a healthy trend in hiding technical details of a system
from the user, there may be system components, such as intelligent agents,
whose (intermediate) decisions have to be made explicit in meaningful ways.
The reason for this is that such decisions may be erroneous. It is
extremely demotivating for users of speech and handwriting recognition
software not to know why recognition sometimes fails. In
handwriting-visual control, the use of facial expressions in a miniature
live icon may be a good way of externalizing aspects of the internal system
state, in a non-invasive manner. Such solutions fall under the
``anthropomorphic'' or ``animistic'' category, as mentioned
in 1.2.2.