Olarik Surinta, Artificial Intelligence and Cognitive Engineering (ALICE)

"Working at the frontiers of knowledge", RUG


    A* Path Planning for Line Segmentation of Handwritten Documents

      Figure: results of A* path planning for line segmentation on the Monk Line Segmentation (MLS) dataset.

      This paper describes the use of a novel A* path-planning algorithm for performing line segmentation of handwritten documents. The novelty of the proposed approach lies in the use of a smart combination of simple soft cost functions that allows an artificial agent to compute paths separating the upper and lower text fields. The use of soft cost functions enables the agent to compute near-optimal separating paths even if the upper and lower text parts are overlapping in particular places. We have performed experiments on the Saint Gall and Monk line segmentation (MLS) datasets. The experimental results show that our proposed method performs very well on the Saint Gall dataset, and also demonstrate that our algorithm is able to cope well with the much more complicated MLS dataset.
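      The idea of soft cost functions can be illustrated with a small grid-based A* search. The sketch below is not the implementation from the paper: the function name `astar_separating_path`, the two cost terms (a penalty for stepping on ink and a penalty for drifting away from the starting row), and their weights `ink_penalty` and `drift_weight` are assumptions chosen for illustration. Because ink only adds cost and never blocks a move, the agent can still cut through a stroke where the upper and lower text lines touch.

```python
import heapq
import math


def astar_separating_path(ink, start_row, ink_penalty=50.0, drift_weight=0.5):
    """Find a left-to-right path through a binary image (1 = ink, 0 = background).

    Illustrative soft costs (assumed, not taken from the paper):
      * ink_penalty  - extra cost for stepping onto an ink pixel
      * drift_weight - extra cost per row of distance from start_row
    Returns a list of (row, col) cells from the left edge to the right edge.
    """
    rows, cols = len(ink), len(ink[0])
    start, goal = (start_row, 0), (start_row, cols - 1)

    def heuristic(node):
        # Euclidean distance never overestimates the remaining step cost.
        return math.hypot(node[0] - goal[0], node[1] - goal[1])

    moves = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
    best = {start: 0.0}          # cheapest known cost to reach each cell
    parent = {}                  # back-pointers for path reconstruction
    frontier = [(heuristic(start), start)]

    while frontier:
        f, node = heapq.heappop(frontier)
        if node == goal:
            path = [node]
            while node in parent:
                node = parent[node]
                path.append(node)
            return path[::-1]
        g = best[node]
        if f > g + heuristic(node):
            continue             # stale queue entry, a cheaper route was found
        for dr, dc in moves:
            r, c = node[0] + dr, node[1] + dc
            if not (0 <= r < rows and 0 <= c < cols):
                continue
            step = math.hypot(dr, dc)
            soft = ink_penalty * ink[r][c] + drift_weight * abs(r - start_row)
            new_g = g + step + soft
            if new_g < best.get((r, c), math.inf):
                best[(r, c)] = new_g
                parent[(r, c)] = node
                heapq.heappush(frontier, (new_g + heuristic((r, c)), (r, c)))
    return None


# Usage: a single ink pixel sits on the starting row, so the path bends around it.
page = [[0] * 7 for _ in range(5)]
page[2][3] = 1
path = astar_separating_path(page, start_row=2)
```

      Raising `ink_penalty` makes the agent take longer detours before it cuts through a stroke, while raising `drift_weight` keeps the path closer to a straight line between the two text lines.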

      Saint Gall dataset

      MLS dataset: Captain's logs (1777), Provincial archive (1855), and an early 15th-century manuscript.

    • O. Surinta, M. Holtkamp, M.F. Karaaba, JP. van Oosten, L.R.B. Schomaker, and M.A. Wiering, "A* Path Planning for Line Segmentation of Handwritten Documents," in Proceedings of the 14th International Conference on Frontiers in Handwriting Recognition (ICFHR), 2014, pp. 175-180.

    Recognizing Handwritten Characters

      Handwritten character recognition systems have several important applications, such as zip-code recognition, writer identification (for example, in forensic research), and searching in historical manuscripts. For such applications, the system should be able to recognize handwritten characters written on many different kinds of documents, such as contemporary or historical manuscripts. The aim is to let the system automatically extract and recognize the characters that are embedded in the manuscript. The quality of the manuscript is one of the factors that affect the recognition accuracy (Gupta et al., 2011). It is therefore essential to deal with the problems that occur in manuscripts, such as distortions in a character image and background noise introduced during the scanning process. The aim of our work is to develop new algorithms that obtain a high recognition accuracy.

      Figure: some examples of the Thai, Bangla (Bengali), and Latin handwritten scripts.

      Histograms of Oriented Gradients (HOG)
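
      A HOG descriptor summarises a character image by histogramming gradient orientations inside small cells, weighting each vote by the gradient magnitude. The sketch below is a simplified, pure-Python illustration and not the configuration used in the publications listed here: the function name `hog_descriptor`, the parameters `cell_size` and `n_bins`, and the single global L2 normalisation (instead of overlapping block normalisation) are assumptions.

```python
import math


def hog_descriptor(image, cell_size=4, n_bins=9):
    """Compute a simplified HOG feature vector.

    image: 2D list of grayscale values (rows of equal length).
    Returns a flat, L2-normalised list with n_bins values per cell.
    """
    rows, cols = len(image), len(image[0])
    cells_y, cells_x = rows // cell_size, cols // cell_size
    hist = [[[0.0] * n_bins for _ in range(cells_x)] for _ in range(cells_y)]
    bin_width = 180.0 / n_bins   # unsigned orientations in [0, 180)

    for r in range(cells_y * cell_size):
        for c in range(cells_x * cell_size):
            # Central differences, clamped at the image border.
            gx = image[r][min(c + 1, cols - 1)] - image[r][max(c - 1, 0)]
            gy = image[min(r + 1, rows - 1)][c] - image[max(r - 1, 0)][c]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            b = int(angle // bin_width) % n_bins
            # Each pixel votes for one orientation bin with its magnitude.
            hist[r // cell_size][c // cell_size][b] += magnitude

    flat = [v for row in hist for cell in row for v in cell]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


# Usage: an 8x8 image with a vertical edge yields 2 x 2 cells x 9 bins = 36 values.
edge = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
features = hog_descriptor(edge)
```

      The resulting vector can be passed to any standard classifier. Smaller cells capture finer stroke detail at the cost of a longer feature vector.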

    • O. Surinta, M.F. Karaaba, L.R.B. Schomaker, and M.A. Wiering, "Recognition of handwritten characters using local gradient feature descriptors," Engineering Applications of Artificial Intelligence, vol. 45, 2015, pp. 405-414.
    • O. Surinta, M.F. Karaaba, T.K. Mishra, L.R.B. Schomaker, and M.A. Wiering, "Recognizing Handwritten Characters with Local Descriptors and Bags of Visual Words," in Proceedings of the 16th International Conference on Engineering Applications of Neural Networks (EANN 2015), 2015, pp. 255-264.
    • O. Surinta, L.R.B. Schomaker, and M.A. Wiering, "A comparison of feature and pixel-based methods for recognizing handwritten Bangla digits," in Proceedings of the 2013 International Conference on Document Analysis and Recognition (ICDAR), 2013, pp. 165-169.
    • O. Surinta, L.R.B. Schomaker, and M.A. Wiering, "Handwritten Character Classification Using the Hotspot Feature Extraction Technique," in Proceedings of the 2012 International Conference on Pattern Recognition Applications and Methods (ICPRAM), 2012, pp. 261-264.