


Pushing Back Frontiers of Handwriting Recognition

The 10th International Workshop on Frontiers in Handwriting Recognition (IWFHR) took place in La Baule, France, on October 23-26, 2006. We took this opportunity to interview Professor Lambert Schomaker, director of Artificial Intelligence research at Rijksuniversiteit Groningen (Netherlands).

What was the genesis of the handwriting recognition research that is now blooming?

Lambert Schomaker: It all started in the 60s with attempts to build optical character recognition algorithms for the upper-case characters and digits of Fortran program source texts. The idea was that programmers would write an algorithm on paper, have it scanned and wait for the computer to translate it into machine-readable form. As a matter of fact, everybody thought handwriting recognition would be easier than speech recognition. Especially for digits, the task was manageable: you are dealing with a tiny set of only 10 classes, from 0 to 9. As an additional advantage, postal automation and banks generated plenty of data for training purposes.

How did it evolve toward the personal computer?

Already in the 60s, everybody had watched Mister Spock's pen computer (a 'Tablet PC' of sorts) in Star Trek. In the 80s, fitting handwriting recognition into personal micros emerged as a real prospect. Apple started the move with its Newton, featuring a cursive handwriting recognition system. At the time, given the variety of styles and writers, everyone in our field knew it simply wasn't doable. Nonetheless… Apple came up with its product. It wasn't a success.

The flop boded ill for the future…

A lot of people thought handwriting recognition simply wouldn't work, period. But in the universities, people kept on exploring this research field. Today, it is starting to yield results, although quality isn't high enough yet.

How would you segment this field of research?

One can distinguish four sectors. First, the world of the Tablet PC, along the lines of Microsoft's framework. It carries an integrated system through which the user operates the handwriting recognition software interactively, using the pen on on-screen buttons.

The second sector is cheque and address recognition. Contrary to the expectations of futurologists, paper hasn't disappeared. In many countries, including France and the US, cheques remain a major payment method. One tends to forget about it, but in this type of application, handwriting recognition has been greatly successful. It saves a lot of money.

Third come security and biometrics. In recent years, we have seen tremendous improvement in the areas of signature verification and writer identification (1). Due to renewed concern about terrorism, demand for such systems has surged. For instance, I work with the NFI, the Netherlands Forensic Institute, on certain types of problematic documents: blackmail messages, bogus bomb warnings, forged letters and so on.

Computerized handwriting recognition can prove efficient. Consider a database of, say, 20,000 persons. There is no way identities can be checked manually; it is just too time-consuming. But in a Google fashion, a handwriting recognition engine can return results that, while not being 100% relevant, remain nevertheless very satisfactory. Imagine for instance that, out of our 20,000-person sample, the engine returns a short list of the 10 most relevant identities. That's already a huge accomplishment. One can then fine-tune and finish the verification manually. In such a configuration, it doesn't take a perfect system to reach high work efficiency.
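The hit-list approach described above can be sketched as a nearest-neighbour search: given a feature vector for the query handwriting, rank the whole database by distance and keep only the top candidates for manual verification. The feature extraction itself is out of scope here, so the vectors and the `shortlist` helper below are purely illustrative placeholders.

```python
import math
import random

def shortlist(query, database, k=10):
    """Rank every writer in the database by Euclidean distance
    to the query feature vector and return the k closest IDs."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    ranked = sorted(database.items(), key=lambda item: dist(query, item[1]))
    return [writer_id for writer_id, _ in ranked[:k]]

# Toy database: 20,000 writers, each with a random 8-dimensional feature vector.
random.seed(0)
database = {i: [random.random() for _ in range(8)] for i in range(20_000)}
query = database[123]  # pretend the query sample was written by writer 123

candidates = shortlist(query, database, k=10)
```

Even if the true writer is not ranked first, it only has to appear somewhere in the short list for the manual follow-up to succeed, which is exactly why a non-perfect system can still be highly productive.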

The fourth sector deals with historical documents. That remains very hard, so we are having a lot of fun with it. Picture this: each and every period comes with its own specific writing. We are asked to browse through archives for which we don't have any labels, making the training all the more difficult. We need 5,000 examples for one single character, while all we are left with are pictures of the documents.

As far as this last field is concerned, is the task hopeless then?

We expect the Internet to be of great help here. We figure a vast public is interested in genealogy. We can initiate a collaborative effort whereby Internet users are invited to give their version of the family names written on the documents. Conceivably, we can collect various suggestions for one single name. Out of these various labels, we could select the most probable one through some election process.
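The election process is left open in the interview; the simplest conceivable instance is a majority vote over the transcriptions submitted for one name. A minimal sketch (the function name and the sample votes are hypothetical):

```python
from collections import Counter

def elect_label(suggestions):
    """Pick the most frequent transcription among user suggestions.
    Ties are broken in favour of the suggestion seen first."""
    counts = Counter(suggestions)
    label, _ = counts.most_common(1)[0]
    return label

# Hypothetical transcriptions of one family name from three volunteers.
votes = ["Schomaker", "Schoemaker", "Schomaker"]
winner = elect_label(votes)  # "Schomaker" wins 2 to 1
```

A real system would likely weight votes by contributor reliability, but the principle of aggregating noisy labels into one probable transcription is the same.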

Having said that, the variety of handwriting is huge. We stumble upon far too many machine mistakes and their consequences in terms of poor productivity. This means there is something we are not doing properly. And it is precisely this difficulty that makes the research so interesting and challenging.

Are we to expect breakthroughs on the device front?

One can expect progress in the wake of electronic paper: new thin plastic displays of A4 size. Right now, PDA screens are just too small. For instance, at our conferences, I have never seen a single professor pull out his PDA in order to show us something. Now, if instead of my mouse I can unroll my sheet of electronic paper and point a finger as I would use a mouse, then I am dealing with something fully operational.

You played an active role in the International Unipen Foundation and its eponymous standard format for handwriting data. How are things evolving as far as the format is concerned?

Unipen was born of a compromise between some 40 companies gathered in a consortium. The Unipen format isn't an ideal one, and we know it. Furthermore, the industry could legitimately reproach us for the small size of our databases. Also, the collection was intended to be difficult, with a lot of writers, tablet styles and so on. For some data sets, an 80% recognition rate on 200 words is already quite good. However, the simple fact that we have an international standard format enables us to compare results, which is the way things work in all scientific communities. I was pleased to see that a good number of papers presented here in La Baule use the Unipen data.

We started working on InkML (2) in the year 2000, with IBM, Motorola and others. It derives from XML and brings a lot of advantages. But what one must understand is that industry and scientists do not share common interests. Manufacturers want to fit their mobile equipment with ever lighter, heavily compressed formats. Conversely, we in the academic world need advanced formats featuring far more context-related details. It was difficult for me to influence the lengthy discussions in the industry consortium. Therefore, I decided I would leave them to their job. In the meantime, my colleague Louis Vuurpijl has explored the possibilities of adding our requirements on top of InkML, together with the company Hewlett-Packard.

So scientists migrated from Unipen to UPX…

UPX is an InkML-based language that meets our specifically research-focused requirements for system training and evaluation. We just use a subset of InkML. In our universities, we run tests on powerful computers, so we do not need special InkML functions such as compression that much. The graphical rendering functions may become important in the future. On the other hand, a lot of consumer products will immediately enjoy the new and advanced functionalities of InkML.

Will the industry follow?

For both InkML and UPX, the presence of applications and demo programs is a key factor. UPX is currently at version 0.9.5, and we will do our best to interest both universities and company R&D groups. The next step should deal with portability toward Windows. Microsoft's acceptance of InkML is also critical. It would be great if Tablet PC applications had InkML file export facilities. The W3C consortium is currently evaluating the InkML format for consolidation. After that, things may move fast. The more InkML data is around, the easier it will be to wrap new data sets into UPX databases for the pattern-recognition community. The need for labeled data is still insatiable!


(1) During the IWFHR conference, Professor Lambert Schomaker and his colleague Marius Bulacu presented their research on combining multiple features for text-independent writer identification and verification. The features are probability distribution functions (PDFs) extracted from the handwriting independently of the textual content of the written samples. Schomaker and Bulacu performed an analysis of feature combinations, showing that fusing multiple features yields increased performance.
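The published work specifies the exact features and distance measures; purely as an illustration of the fusion idea, one can combine several per-feature distances between two writers' PDFs by averaging them. The sketch below assumes a chi-square distance per feature and equal-weight fusion, both of which are assumptions for illustration, not the authors' exact method.

```python
def chi_square(p, q, eps=1e-12):
    """Chi-square distance between two discrete PDFs over the same bins."""
    return sum((a - b) ** 2 / (a + b + eps) for a, b in zip(p, q))

def fused_distance(features_a, features_b):
    """Average the per-feature chi-square distances; a lower value
    suggests the two samples are more likely from the same writer."""
    dists = [chi_square(p, q) for p, q in zip(features_a, features_b)]
    return sum(dists) / len(dists)

# Two toy writers, each described by two normalised feature PDFs.
writer_a = [[0.2, 0.5, 0.3], [0.1, 0.1, 0.8]]
writer_b = [[0.25, 0.45, 0.3], [0.1, 0.2, 0.7]]
same_writer = [[0.2, 0.5, 0.3], [0.1, 0.1, 0.8]]

d_same = fused_distance(writer_a, same_writer)
d_diff = fused_distance(writer_a, writer_b)
```

The intuition behind the reported performance gain is that each feature captures a different aspect of handwriting style, so their errors are partly independent and averaging suppresses them.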


(2) InkML stands for Ink Markup Language. It is an XML data format for representing digital ink data that is input with an electronic pen or stylus as part of a multimodal system. It is being developed within the framework of the World Wide Web Consortium (W3C), which develops interoperable technologies for the Internet.

In the early days, hardware and software builders heavily relied on proprietary formats for their storage and representation of digital ink. Unfortunately, these formats were poorly compatible.

Proliferation of such heterogeneous formats has notably limited the use of digital ink across devices developed by the industry. Hence the need for a standard format.

InkML is a platform-neutral data format designed to promote the interchange of digital ink between software applications.

It supports a complete and accurate representation of digital ink. It allows recording of information about transducer device characteristics. It details dynamic behaviour to support applications such as handwriting recognition and authentication.
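Concretely, a minimal InkML document is an `<ink>` root element containing `<trace>` elements, each holding the sampled pen points as comma-separated coordinate pairs. A sketch using Python's standard library (the `points_to_inkml` helper is invented for illustration; the namespace URI is the one defined by the W3C working group):

```python
import xml.etree.ElementTree as ET

INKML_NS = "http://www.w3.org/2003/InkML"

def points_to_inkml(traces):
    """Serialise a list of pen traces (each a list of (x, y) samples)
    into a minimal InkML document string."""
    ink = ET.Element("ink", {"xmlns": INKML_NS})
    for trace in traces:
        elem = ET.SubElement(ink, "trace")
        # InkML encodes a trace as comma-separated points, e.g. "10 0, 9 14".
        elem.text = ", ".join(f"{x} {y}" for x, y in trace)
    return ET.tostring(ink, encoding="unicode")

doc = points_to_inkml([[(10, 0), (9, 14), (8, 28)]])
```

A full InkML document can additionally carry `traceFormat` channel definitions for pressure, tilt and timing, which is precisely the device and dynamic-behaviour information mentioned above.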

Last modified 2006/12/06 17:50
