Short CV

Dr. Nikolaos Stamatopoulos was born in 1984 in Athens, Greece. He received his Bachelor's degree in Informatics and Telecommunications in 2006 and his Ph.D. degree in 2011, both from the Department of Informatics and Telecommunications of the National and Kapodistrian University of Athens. His Ph.D. thesis is titled "Optical Process and Analysis of Historical Documents". He is currently working as a research associate at the Institute of Informatics and Telecommunications of the National Center for Scientific Research "Demokritos", Athens, Greece. He has participated in several research and industrial projects and is a program committee member of several international conferences and workshops (ICDAR, ICFHR, DAS, MEDPRAI). His main research interests are Image Processing and Document Image Analysis, Processing of Historical Documents, OCR and Pattern Recognition, and he has authored many papers in journals and conference proceedings in these areas.

R&D Projects

2015 - Present

PureBills

The PureBills project is an industrial project in collaboration with Octangon5 Pty Ltd for the automatic identification, recognition and indexing of paper and electronic invoices and receipts. The software application automatically extracts indexing information from single or multi-page invoices and receipts.

2016 - 2019

READ

READ (Recognition and Enrichment of Archival Documents) is a project funded by the EU (H2020-EU.1.4.1.3). The overall objective of READ is to implement a Virtual Research Environment where archivists, humanities scholars, computer scientists and volunteers are collaborating with the ultimate goal of boosting research, innovation, development and usage of cutting edge technology for the automated recognition, transcription, indexing and enrichment of handwritten archival documents.

2013 - 2016

DRVS

The DRVS (Digital Receipt Validation System) project is an industrial project in collaboration with TPG Rewards Inc. for the automatic processing of receipt images in order to detect specific product names. When a product is detected, its price as well as the date/time and the total amount of the receipt are also detected and recognized.

2013 - 2015

tranScriptorium

tranScriptorium is a STREP of the Seventh Framework Programme in the ICT for Learning and Access to Cultural Resources challenge. It aims to develop innovative, efficient and cost-effective solutions for the indexing, search and full transcription of historical handwritten document images, using modern, holistic Handwritten Text Recognition (HTR) technology.

2014 - 2015

OldDocPro

OldDocPro is a Greek project that aims to develop an innovative system for the recognition of Greek machine-printed and handwritten polytonic (multi-accent) documents. Old Greek polytonic scripts have a large variety of diacritic marks and, as a result, a large number of character classes (more than 270).

2012

CITO

The CITO project is an industrial project in collaboration with NeuroScript LLC for the processing of handwritten document images written by schoolchildren. The final goal is the calculation of various characteristics of children's handwriting.

2008 - 2012

IMPACT

IMPACT (Improving Access to Text) is a project funded by the EU (FP7-ICT-2007-1). It aims to significantly improve access to historical text and to take away the barriers that stand in the way of the mass digitization of the European cultural heritage. IMPACT as a network of centres of competence brings together fifteen national and regional libraries, research institutions and commercial suppliers.

2007 - 2008

POLYTIMO

Polytimo is a Greek GSRT project. It aims to develop an innovative system for processing, managing and providing access to the content of valuable historical books and handwritten manuscripts.

Competitions

Nikolaos Stamatopoulos has co-organized competitions held at the following conferences:

International Conference on Document Analysis and Recognition 2017 (ICDAR2017)

International Conference on Frontiers in Handwriting Recognition 2014 (ICFHR2014)

International Conference on Frontiers in Handwriting Recognition 2012 (ICFHR2012)

International Conference on Document Analysis and Recognition 2011 (ICDAR2011)

International Conference on Frontiers in Handwriting Recognition 2010 (ICFHR2010)

International Conference on Document Analysis and Recognition 2009 (ICDAR2009)

International Conference on Document Analysis and Recognition 2007 (ICDAR2007)

Publications

Ph.D. Dissertation

N. Stamatopoulos, "Optical Process and Analysis of Historical Documents", 2011 (in Greek).

An extended abstract of the dissertation is available in English.

Book Chapters

  • N. Stamatopoulos, G. Louloudis and B. Gatos, Handwriting Segmentation, in the book “Document Analysis and Text Recognition: Benchmarking State-of-the-Art Systems”, World Scientific Publishing Co., ISBN: 978-981-3229-26-6, 2018.

  • Document image segmentation is the process of dividing a document image into its base text components (blocks, text lines, words, characters). One of the most important and challenging tasks of document image analysis is the segmentation of handwritten document images into text lines and words. The overall performance of a character recognition or a word spotting system strongly relies on the results of the text line and word segmentation process. Although text line and word segmentation for machine-printed documents, and especially modern, is usually considered as a solved problem, segmentation of handwritten document images still presents significant challenges and it is an open problem. Different types of challenges are encountered in the handwritten text line and word segmentation processes. Concerning text line segmentation, difference in the skew angle between text lines, curvilinear text lines, variation in inter-line gaps, overlapping and touching text lines that frequently appear in handwritten document images are some of the challenging issues. Furthermore, the appearance of accents in some languages (e.g. Greek) increases segmentation complexities. Regarding word segmentation, the challenges include the appearance of skew along a single text line, the existence of slant, the non-uniform spacing of words as well as the existence of punctuation marks. Over the last decade, a wide variety of segmentation methods for handwritten document images has been reported in the literature. Moreover, four handwriting segmentation competitions have been organized in the context of the ICDAR and ICFHR conferences (ICDAR2007, ICDAR2009, ICFHR2010 and ICDAR2013) in order to address the need for objective, comparative and detailed evaluation under realistic circumstances and standard datasets. 
In this chapter, the evaluation results of the handwriting segmentation competitions are summarized in terms of a detailed description of the benchmarking datasets and the evaluation protocol used. Moreover, a brief description of the participating methods complemented by recently published methods which report on the competition’s data are presented.

  • G. Louloudis, N. Stamatopoulos and B. Gatos, Writer Identification, in the book “Document Analysis and Text Recognition: Benchmarking State-of-the-Art Systems”, World Scientific Publishing Co., ISBN: 978-981-3229-26-6, 2018.

  • Writer identification is a behavioral handwriting-based recognition problem which proceeds by matching unknown handwritings against a database of samples with known authorship. From the document image analysis scope, writer identification can be defined as the retrieval of handwritten samples of the same writer from a database using a handwritten sample as a graphical query. The large number of recent publications, as well as the organization of several competitions, shows that writer identification is a very active and promising area of research. The identification of the writer of a handwritten document has a wide variety of applications. For example, analysis of handwritten documents has great bearing on criminal justice systems. As stated by Srihari et al., “Numerous cases over the years have dealt with evidence provided by handwritten documents such as wills and ransom notes.” Other application areas include security, financial activity, forensic analysis and access control. The main challenges of a writer identification system, as described by Schomaker et al., include the variability and variation of handwritten patterns even among documents of the same writer, the limited amount of image data and the presence of noise patterns. Another challenge concerns the large number of classes (writers) among which the final decision should be taken. In this chapter, we summarize the results of the writer identification competition series for Latin documents presented in the ICDAR and ICFHR conferences, including the benchmarking datasets, the evaluation protocol and the participating methods, together with several recently published methods which made use of the benchmarking datasets. Finally, we draw some comments and conclusions.

  • B. Gatos, G. Louloudis, N. Stamatopoulos and G. Sfikas, Historical Document Processing, in the book “Handwriting: Recognition, Development and Analysis”, Nova Science Publishers, ISBN: 978-1-53611-957-2, 2017.

  • Historical manuscript collections can be considered as an important source of original information in order to provide access to historical data and develop cultural documentation over the years. This chapter reports on recent advances and ongoing developments for historical handwritten document processing. It outlines the main challenges involved, the different tasks that have to be implemented as well as practices and technologies that currently exist in the literature. The focus is on the most promising techniques as well as on existing datasets and competitions that can prove useful for historical handwritten document processing research. The main tasks that have to be implemented in the historical document image recognition pipeline include preprocessing for image enhancement and binarization, segmentation for the detection of main page elements, text lines and words and, finally, recognition. In cases where optical recognition is expected to give poor results, keyword spotting has been proposed to substitute full-text recognition. The organization of this chapter is as follows. Section “Preprocessing” gives an overview of document image enhancement and binarization methods while section “Segmentation” presents layout analysis, text line and word segmentation state-of-the-art techniques for historical handwritten documents. In section “Handwritten Text Recognition (HTR)” the focus is on the pure recognition task which can be accomplished on text line, word or character level. Finally, in section “Keyword spotting” recent advances on searching for a keyword directly on the historical document images are presented.

Journals

  • G. Mühlberger, L. Seaward, ... , N. Stamatopoulos, ... , H. Wurster and K. Zagoris, “Transforming Scholarship in the Archives Through Handwritten Text Recognition: Transkribus as a Case Study”, Journal of Documentation, vol. 75, no. 5, pp. 954-976, 2019. Impact factor: 0.853.

  • Archives are increasingly investing in the digitisation of their manuscript collections but until recently the textual content of the resulting digital images has only been available to those who have the time to study and transcribe individual passages. The use of computers to process and search images of historical papers using Handwritten Text Recognition (HTR) has the potential to transform access to our written past for the use of researchers, institutions and the general public. This paper reports on the Recognition and Enrichment of Archival Documents (READ) European Union Horizon 2020 project which is developing advanced text recognition technology on the basis of artificial neural networks and resulting in a publicly available infrastructure: the Transkribus platform. Users of Transkribus (whether institutional or individual) are able to extract data from handwritten and printed texts via HTR, while simultaneously contributing to the improvement of the same technology thanks to machine learning principles. The automated recognition of a wide variety of historical texts has significant implications for the accessibility of the written records of global cultural heritage. This paper uses the Transkribus platform as a case study, focusing on the development, application and impact of HTR technology. It demonstrates that HTR has the capacity to make a significant contribution to the archival mission by making it easier for anyone to read, transcribe, process and mine historical documents. It shows that the technology fits neatly into the archival workflow, making direct use of growing repositories of digitised images of historical texts. By providing examples of institutions and researchers who are generating new resources with Transkribus, the paper shows how HTR can extend the existing research infrastructure of the archives, libraries and humanities domain. 
Looking to the future, this paper argues that this form of machine learning has the potential to change the nature and scope of historical research. Finally, it suggests that a cooperative approach from the archives, library and humanities community is the best way to support and sustain the benefits of the technology offered through Transkribus.

  • G. Retsinas, G. Louloudis, N. Stamatopoulos and B. Gatos, “Efficient Learning-Free Keyword Spotting”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 7, pp. 1587-1600, 2019. Impact factor: 8.329.

  • In this article, a method for segmentation-based learning-free Query by Example (QbE) keyword spotting on handwritten documents is proposed. The method consists of three steps, namely preprocessing, feature extraction and matching, which address critical variations of text images (e.g. skew, translation, different writing styles). During the feature extraction step, a sequence of descriptors is generated using a combination of a zoning scheme and a novel appearance descriptor, referred to as modified Projections of Oriented Gradients. The preprocessing step, which includes contrast normalization and main-zone detection, aims to overcome the shortcomings of the appearance descriptor. Moreover, an uneven zoning scheme is introduced by applying a denser zoning only on query images for a more detailed representation. This leads to a significant reduction in storage requirements of a document collection. The distance between the query and word sequences is efficiently computed by the proposed Selective Matching algorithm. This algorithm is further extended to handle an augmented set of images originating from a single query image. The efficiency of the proposed method is demonstrated by experimentation conducted on seven publicly available datasets. In these experiments, the proposed method significantly outperforms all state-of-the-art learning-free techniques.
    1. L. Kopeykina and A.V. Savchenko, “Automatic Privacy Detection in Scanned Document Images Based on Deep Neural Networks”, International Russian Automation Conference (RusAutoCon'19), 2019
    2. V. Thakur and H. Sikarwar, “Deep Learning Feature Extraction for Handwritten Keyword Spotting in Historical Documents”, 2nd International Conference on Emerging Trends in Engineering & Applied Science (ICETEAS'19), vol. 5, no. 1, pp. 11 – 15, 2019

  • N. Stamatopoulos, B. Gatos and I. Pratikakis, “Performance Evaluation Methodology for Document Image Dewarping Techniques”, IET Image Processing, vol. 6, no. 6, pp. 738-745, 2012. Impact factor: 0.895.

  • The performance evaluation of dewarping techniques is currently addressed by concentrating on visually pleasing impressions or by using OCR as a means for indirect evaluation. In this paper, we present a performance evaluation methodology that calculates a comprehensive evaluation measure which reflects the entire performance of a dewarping technique in a concise quantitative manner. The proposed evaluation measure takes into account the deviation of the dewarped text lines from a horizontal straight reference which is considered to be the optimal result. This measure is expressed by the integral over the dewarped text line curves. To reduce the manual effort for identifying the text lines in the dewarped image, we propose a point-to-point matching procedure that finds the correspondence between the manually marked warped document image and the dewarping counterpart. This enables the evaluation of an unlimited number of methodologies using a marking procedure which is applied only once. The validity of the proposed performance evaluation methodology is demonstrated by a concise experimental work that comprises four state-of-the-art dewarping techniques as well as the involvement of different users in the interactive part of the procedure.
    1. C. Hong, S. Colburn and A. Majumdar, “Flat metaform near-eye visor”, Applied Optics, vol. 56, no. 31, pp. 8822-8827, 2017
    2. P. Yang, A. Antonacopoulos, C. Clausner, S. Pletschacher and J. Qi, “Effective geometric restoration of distorted historical document for large-scale digitisation”, IET Image Processing, vol. 11, no. 10, pp. 841-853, 2017
    3. M. Rahnemoonfar and B. Plale, “Automatic performance evaluation of dewarping methods in large scale digitization of historical documents”, 13th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL'13), pp. 331-334, Indiana, USA, 2013.
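
The idea behind the evaluation measure in the abstract above (the deviation of dewarped text lines from a horizontal straight reference, expressed as an integral over the text-line curves) can be illustrated with a minimal Python sketch. This is not the published measure: the function name `line_deviation`, the use of the mean y-coordinate as the horizontal reference, the trapezoidal approximation of the integral and the normalization by line width are all assumptions made for illustration.

```python
def line_deviation(points):
    """Average absolute deviation of one text-line curve from a horizontal reference.

    points: list of (x, y) samples along a dewarped text line, sorted by x.
    Assumption: the horizontal reference is the line y = mean(y).
    """
    if len(points) < 2:
        raise ValueError("need at least two sample points")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    width = xs[-1] - xs[0]
    if width <= 0:
        raise ValueError("points must span a positive horizontal extent")

    reference = sum(ys) / len(ys)

    # Trapezoidal approximation of the integral of |y(x) - reference| over x.
    area = 0.0
    for i in range(len(points) - 1):
        dx = xs[i + 1] - xs[i]
        left = abs(ys[i] - reference)
        right = abs(ys[i + 1] - reference)
        area += 0.5 * (left + right) * dx

    # Normalize by the line width so lines of different length are comparable.
    return area / width
```

Under these assumptions a perfectly straight horizontal line scores 0, and the score grows as the line curves away from its reference, so a lower value indicates a better dewarping result for that line.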

  • N. Stamatopoulos, B. Gatos, I. Pratikakis and S.J. Perantonis, “Goal-oriented Rectification of Camera-Based Document Images”, IEEE Transactions on Image Processing, vol. 20, no. 4, pp. 910-920, 2011. Impact factor: 3.042.

  • Document digitization with either flatbed scanners or camera-based systems results in document images which often suffer from warping and perspective distortions that deteriorate the performance of current OCR approaches. In this paper, we present a goal-oriented rectification methodology to compensate for undesirable document image distortions aiming to improve the OCR result. Our approach relies upon a coarse-to-fine strategy. First, a coarse rectification is accomplished with the aid of a computationally low cost transformation which addresses the projection of a curved surface to a 2-D rectangular area. The projection of the curved surface on the plane is guided only by the textual content's appearance in the document image while incorporating a transformation which does not depend on specific model primitives or camera setup parameters. Second, pose normalization is applied on the word level aiming to restore all the local distortions of the document image. Experimental results on various document images with a variety of distortions demonstrate the robustness and effectiveness of the proposed rectification methodology using a consistent evaluation methodology that takes into account OCR accuracy and a newly introduced measure using a semi-automatic procedure.
    1. A. Garai and S. Biswas, “Dewarping of Single-Folded Camera Captured Bangla Document Images”, Computational Intelligence in Pattern Recognition (CIPR'19), pp. 647-656, 2019
    2. K.M. Hung, C.H. Yih and C.H. Yeh, “A Reading Assistant System Based on Restoring Warped Document Image”, Journal of Applied Science and Engineering, vol. 21, no. 3, pp. 475-484, 2018
    3. G. Meng, Y. Su, Y. Wu, S. Xiang and C. Pan, "Exploiting Vector Fields for Geometric Rectification of Distorted Document Images”, European Conference on Computer Vision (ECCV'18), pp. 172-187, 2018
    4. V.V. Vashi and D.G. Jani, “Review Paper based on Different Technologies to Read Text using Optical Character Recognition”, International Journal of Management, Technology And Engineering, vol. 8, no. V, pp. 7-10, 2018
    5. L. Galarza, H. Martin and M. Adjouadi, “Integrating low-resolution depth maps to high-resolution images in the development of a book reader design for persons with visual impairment and blindness”, International Journal of Innovative Computing, Information and Control (ICIC), vol. 14, no. 3, pp. 797-816, 2018
    6. R. Sun, S. Wang, L. Ji and Z. Wang, “Multi-scale document image rectification utilising text-features”, Electronics Letters, vol. 54, no. 8, pp. 502-503, 2018
    7. C. Yan, J. Hu and C. Zhang, “Deep Transformer: A Framework For 2D Text Image Rectification From Planar Transformations”, Neurocomputing, https://doi.org/10.1016/j.neucom.2018.02.015, 2018
    8. S. You, Y. Matsushita, S. Sinha, Y. Bou and K. Ikeuchi, “Multiview Rectification of Folded Documents”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 2, pp. 505-511, 2018
    9. R. Sun, N. Li, S. Wang, L. Ji and Z. Wang, “The rectification of document images using text-features”, 7th International Conference on Virtual Reality and Visualization, (ICVRV'17), pp. 223-228, 2017
    10. A. Garai, S. Biswas, S. Mandal and B.B. Chaudhuri, “Automatic dewarping of Camera Captured Born-Digital Bangla Document Images”, 9th International Conference on Advances in Pattern Recognition (ICAPR'17), pp. 94-99, 2017
    11. W.T. Dar and M.N.A Khan, “Click-Free, Video-Based Document Capture - Methodology and Evaluation”, 7th International Workshop on Camera-Based Document Analysis and Recognition, (CBDAR'17), pp. 21-26, 2017
    12. P. Yang, A. Antonacopoulos, C. Clausner, S. Pletschacher and J. Qi, “Effective geometric restoration of distorted historical document for large-scale digitisation”, IET Image Processing, vol. 11, no. 10, pp. 841-853, 2017
    13. S. Das, G. Mishra, A. Sudharshana and R. Shilkrot, "The Common Fold: Utilizing the Four-Fold to Dewarp Printed Documents from a Single Image", ACM Symposium on Document Engineering (DocEng'17), pp. 125-128, 2017
    14. H. Eslami, A.A. Raie and K. Faez, "Precise vehicle speed measurement based on a hierarchical homographic transform estimation for law enforcement applications", IEICE Transactions on Information and Systems, vol. E99D, no. 6, pp. 1635-1644, 2016.
    15. G. Meng, S. Xiang, C. Pan and N. Zheng, "Active Rectification of Curved Document Images Using Structured Beams", International Journal of Computer Vision, DOI: 10.1007/s11263-016-0952-z, 2016
    16. S. Kumar, K. Kumar, R.K. Mishra, "Scene Text Recognition using Artificial Neural Network: A Survey", International Journal of Computer Applications, vol. 137, no. 6, pp. 40-50, 2016
    17. S. Calarasanu, S. Dubuisson and J. Fabrizio, "Towards the rectification of highly distorted texts", 11th International Conference on Computer Vision Theory and Applications (VISAPP'15), pp. 1-8, 2016
    18. Z. Huang, J. Gu, G. Meng and C. Pan, "Text line extraction of curved document images using hybrid metric", 3rd IAPR Asian Conference on Pattern Recognition (ACPR), Malaysia, pp. 251-255, 2016
    19. C. Crovato, D. Torok, R. Heidrich, B. Cerqueira and E. Velho , “Preparing for OCR of Books Handled by Visually Impaired”, 10th International Conference Ubiquitous Computing and Ambient Intelligence (UCAmI'16), pp. 419-430, 2016
    20. Q. Ye and D. Doermann, "Text Detection and Recognition in Imagery: A Survey", IEEE Transactions on Pattern Analysis and Machine Intelligence , vol. 37, no. 7 , pp. 1480-1500, 2015
    21. G. Meng, Z. Huang, Y. Song, S. Xiang and C. Pan, "Extraction of Virtual Baselines from Distorted Document Images Using Curvilinear Projection", International Conference on Computer Vision (ICCV'15), pp. 3925-3933, 2015
    22. M.K. Alqudah, M.F. Bin Nasrudin, B. Bataineh, M. Alqudah and A. Alkhatatneh, "Investigation of binarization techniques for unevenly illuminated document images acquired via handheld cameras", 2nd International Conference on Computer, Communications, and Control Technology (I4CT'15) , pp. 524-529, 2015
    23. M. Fawzi, M.A. Rashwan, H. Ahmed, S. Samir, S.M. Abdou, H.M. Al-Barhamtoshy and K.M. Jambi, "Rectification of Camera Captured Document Images for Camera-Based OCR Technology", 6th International Workshop on Camera Based Document Analysis and Recognition (CBDAR'15) , pp. 1226-1230, Nancy, France, 2015
    24. B.S Kim, H.I. Koo and N.I. Cho, “Document Dewarping via Text-line based Optimization”, Pattern Recognition, doi:10.1016/j.patcog.2015.04.026, 2015
    25. M. P. Nevetha and A. Baskar, " Applications of Text Detection and its Challenges: A Review", 3rd International Symposium on Women in Computing and Informatics (WCI '15), pp. 712-721, 2015
    26. Y.S. Lin, K.H. Lo, H.T. Chen and J.H. Chuang, "Vanishing point-based image transforms for enhancement of probabilistic occupancy map-based people localization", IEEE Transactions on Image Processing, vol. 23, no. 12, pp. 5586-5598, 2014
    27. G. Meng, Y. Wang, S. Qu, S. Xiang, C. Pan, "Active flattening of curved document images via two structured beams", Conference on Computer Vision and Pattern Recognition (CVPR'14), Columbus, USA, pp. 3890-3897, 2014
    28. C. Liu, Y. Zhang, B. Wang and X. Ding, “Restoring camera-captured distorted document images”, International Journal on Document Analysis and Recognition (IJDAR), vol. 18, no. 2 pp. 111–124, 2014
    29. Q. Ye, "Text Detection and Recognition in Imagery: A Survey ", IEEE Transactions on Pattern Analysis and Machine Intelligence, DOI: 10.1109/TPAMI.2014.2366765, 2014
    30. W. Pan, Z. Lian, R. Sun, Y. Tang, and J. Xiao, "FlexiFont: a flexible system to generate personal font libraries", In Proceedings of the 2014 ACM symposium on Document engineering (DocEng '14), Colorado, USA, pp. 17-20, 2014
    31. L. Zhang, Q. Fan, Y. Li, Y. Uchimura and S. Serikawa, “An Implementation of Document Image Reconstruction System on A Smart Device Using a 1D Histogram Calibration Algorithm”, Mathematical Problems in Engineering, article number 313452, 2014
    32. S. Xie, Y. He, P. Pan, J. Sun and S. Naoi, “Book Inner Boundary Extraction with Modified Active Shape Model”, Pattern Recognition Letters, vol. 45, no. 1 pp. 121–128, 2014
    33. C. Merino-Gracia, M. Mirmehdi and J. Sigut, “Fast Perspective Recovery of Text in Natural Scenes”, Image and Vision Computing, vol. 31, no. 10, pp. 714-724, 2013
    34. L. Tong, Q. Peng, S. Li, H. Zhao and G. Zhan, “Vector constraint and Ncc based Chinese document image mosaic”, Journal of Applied Sciences, vol. 13, no. 9, pp. 1537-1543, 2013
    35. M. Rahnemoonfar and B. Plale, “Automatic performance evaluation of dewarping methods in large scale digitization of historical documents”, 13th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL'13), pp. 331-334, Indiana, USA, July 2013
    36. L. Tong, G. Zhan, Q. Peng, Y. Li and Y. Li, “Warped Document Image Mosaicing Method Based on Inflection Point Detection and Registration”, 4th International Conference on Multimedia Information Networking and Security (MINES'12), pp. 306-310, Nanjing, Jiangsu, China, November 2012
    37. A.M. Abdu and M.M. Mokji, “A novel approach to a dynamic template generation algorithm for multiple-choice forms”, International Conference on Control System, Computing and Engineering ( ICCSCE'12), pp. 216 - 221, Batu Ferringhi, Penang, November 2012
    38. P. Yang, A. Antonacopoulos, C. Clausner and S. Pletschacher, “Grid-based modelling and correction of arbitrarily warped historical document images for large-scale digitisation”, 1st Workshop on Historical Document Imaging and Processing (HIP'11), pp. 106-111, Beijing, China, September 2011
    39. I. Kastelan, M. Katona, D. Marijan and J. Zloh, “Automated optical inspection system for digital TV sets”, EURASIP Journal on Advances in Signal Processing, pp. 140-140. 2011

  • B. Gatos, N. Stamatopoulos and G. Louloudis, “ICDAR2009 Handwriting Segmentation Contest”, International Journal on Document Analysis and Recognition (IJDAR), vol. 14, no. 1, pp. 25-33, 2011. Impact factor: 1.03.

  • The ICDAR 2009 Handwriting Segmentation Contest was organized in the context of the ICDAR2009 conference in order to record recent advances in off-line handwriting segmentation. The contest includes handwritten document images produced by many writers in several languages (English, French, German and Greek). These images are manually annotated in order to produce the ground truth which corresponds to the correct text line and word segmentation result. For the evaluation, a well-established approach is used based on counting the number of matches between the entities detected by the segmentation algorithm and the entities in the ground truth. This paper describes the contest details including the dataset, the ground truth and the evaluation criteria and presents the results of the 12 participating methods as well as of two state-of-the-art algorithms. A description of the winning algorithms is also given.
    1. G.M. Binmakhashen and S.A. Mahmoud, “Document Layout Analysis: A Comprehensive Survey”, ACM Computing Surveys (CSUR), vol 52, no. 6, 2019
    2. B.M.K. Sharma and V.S. Dhaka, “Segmentation of handwritten words using structured support vector machine”, Pattern Analysis and Applications, DOI: https://doi.org/10.1007/s10044-019-00843-x, 2019
    3. B. Barakat, A. Droby, M. Kassis and J. El-Sana, “Text Line Segmentation for Challenging Handwritten Document Images Using Fully Convolutional Network”, 16th International Conference on Frontiers in Handwriting Recognition (ICFHR'18), pp. 374-379, 2018
    4. M.S. Deshmukh, M.P. Patil and S.R. Kolhe, “A hybrid text line segmentation approach for the ancient handwritten unconstrained freestyle Modi script documents”, Imaging Science Journal, vol. 66, no. 7, pp. 433-44, 2018
    5. D. Aldavert and M. Rusinol, “Manuscript text line detection and segmentation using second-order derivatives”, 13th IAPR International Workshop on Document Analysis Systems (DAS'18), pp. 293-298, 2018
    6. T. Gruning, R. Labahn, M. Diem, F. Kleber and S. Fiel, “READ-BAD: A new dataset and evaluation scheme for baseline detection in archival documents”, 13th IAPR International Workshop on Document Analysis Systems (DAS'18), pp. 351-356, 2018
    7. A. Vij and J. Pruthi, “An automated Psychometric Analyzer based on Sentiment Analysis and Emotion Recognition for healthcare”, International Conference on Computational Intelligence and Data Science (ICCIDS'18), pp. 1184-1191, 2018
    8. T. Grüning, G. Leifert, T. Strauss and R. Labahn, “A Robust and Binarization-Free Approach for Text Line Detection in Historical Documents”, 14th IAPR International Conference on Document Analysis and Recognition (ICDAR'17), pp. 236-241, 2017
    9. H.E. Bahi and A. Zatni, “Segmentation and recognition of text images acquired by a mobile phone”, International Journal of Tomography and Simulation, vol. 30, no. 4, pp. 95-107, 2017
    10. J.L. Pach and P. Bilski, “A Robust Binarization and Text Line Detection in Historical Handwritten Documents Analysis”, International Journal of Computing, vol 3, no. 15, pp. 154-161, 2016
    11. J.P. Pellicer, M.Z. Afzal, M. Liwicki and M.J. Castro-Bleda, “Complete Text Line Extraction with Convolutional Neural Networks and Watershed Transform”, 12th Workshop on Document Analysis Systems (DAS'16), pp. 30-35, Santorini, Greece, 2016
    12. O. Biller, I. Rabaev, K. Kedem, I. Dinstein and J.J. El-Sana, “Evolution maps and applications”, PeerJ, vol 2016, no. 1. art. no. e39, 2016
    13. Y. Lin, Y. Song, Y. Li, F. Wang and K. He , “Multilingual corpus construction based on printed and handwritten character separation”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 826-830, Nancy, France, 2015
    14. A. Asi, R. Cohen, K. Kedem and J. El-Sana, “Simplifying the Reading of Historical Manuscripts”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 826-830, Nancy, France, 2015
    15. L. Wang, W. Fan, J. Sun, S. Naoi and T. Hiroshi, “Text Line Extraction in Document Images”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 191-195, Nancy, France, 2015
    16. W. Swaileh, K.A. Mohand and T. Paquet, "Multi-script Iterative Steerable Directional Filtering For Handwritten Text Line Extraction", 5th International Workshop on Multilingual OCR (MOCR'15), pp. 1241-1245, Nancy, France, 2015
    17. S. Chandna, D. Tonne, T. Jejkal, R. Stotzka, C. Krause, P. Vanscheidt, H. Busch and A. Prabhune, “Software Workflow for the Automatic Tagging of Medieval Manuscript Images (SWATI)”, Document Recognition and Retrieval XXII, Vol. 9402, 940206, 2015
    18. M.A. Ramírez-Ortegón, L.L. Ramírez-Ramírez, I.B. Messaoud, V. Märgner, E. Cuevas and R. Rojas, “Document Retrieval with Unlimited Vocabulary”, IEEE Winter Conference on Applications of Computer Vision, Waikoloa Beach, USA 2015
    19. R. Cohen, I. Dinstein, J. El-Sana and K. Kedem, “Using Scale-Space Anisotropic Smoothing for Text Line Extraction in Historical Documents”, 11th International Conference on Image Analysis and Recognition (ICIAR'14), pp. 349-358, 2014
    20. M.A. Ramírez-Ortegón, L.L. Ramírez-Ramírez, I.B. Messaoud, V. Märgner, E. Cuevas and R. Rojas, “A model for the gray-intensity distribution of historical handwritten documents and its application for binarization”, International Journal on Document Analysis and Recognition, vol. 17, no. 2, pp. 139-160, 2014
    21. D. Brodic, Z.N. Milivojevic and D.R. Milivojevic, “Comparison of Two Goal-Oriented Methods for the Evaluation of the Text-Line Segmentation Algorithms”, Przegląd Elektrotechniczny, ISSN 0033-2097, R. 89 NR 6/2013, 2013
    22. D. Brodic, “Methodology for the Evaluation of the Algorithms for Text Line Segmentation Based on Extended Binary Classification”, Measurement Science Review, vol. 11, no. 3, pp. 71-78, 2011

  • N. Nikolaou, M. Makridis, B. Gatos, N. Stamatopoulos and N. Papamarkos, “Segmentation of historical machine-printed documents using Adaptive Run Length Smoothing and skeleton segmentation paths”, Image and Vision Computing, vol. 28, no. 4, pp. 590-604, 2010. impact factor: 1.525

  • In this paper, we strive towards the development of efficient techniques in order to segment document pages resulting from the digitization of historical machine-printed sources. Documents of this kind often suffer from low quality and local skew, show several degradations due to the old printing matrix quality or ink diffusion, and exhibit complex and dense layouts. To face these problems, we introduce the following innovative aspects: (i) use of a novel Adaptive Run Length Smoothing Algorithm (ARLSA) in order to face the problem of complex and dense document layout, (ii) detection of noisy areas and punctuation marks that are usual in historical machine-printed documents, (iii) detection of possible obstacles formed from background areas in order to separate neighboring text columns or text lines, and (iv) use of skeleton segmentation paths in order to isolate possible connected characters. Comparative experiments using several historical machine-printed documents prove the efficiency of the proposed technique.
    1. G.M. Binmakhashen and S.A. Mahmoud, “Document Layout Analysis: A Comprehensive Survey”, ACM Computing Surveys (CSUR), vol 52, no. 6, 2019
    2. R. Goyal, R.K. Narula and M. Kumar Jindal, "An experimental technique for ocr line and word segmentation using probability distribution estimation", International Journal of Recent Technology and Engineering, vol. 8, no. 2, pp. 1484-1494
    3. S.R. Narang, M.K. Jindal and M. Kumar, "Line Segmentation of Devanagari Ancient Manuscripts", National Academy of Sciences, India Section A: Physical Sciences, pp. 1-8, 2019
    4. M. Liulei, K. Moydin, A. Dawut and A. Hamdulla, "The Algorithms for Segmentation of Text-Lines in Handwriting Images", 3rd International Conference on Smart City and Systems Engineering (ICSCSE'18), pp. 919-922, 2018
    5. J. Zheng, X. Miao, S.H. Fang, J. Chen and H. Jiang, "Enhanced Character Segmentation for Multi-Language Data Plate in Substation Transformer Based on Connected Component Analysis", 15th International Conference on Control, Automation, Robotics and Vision (ICARCV'18), Singapore, pp. 180-185, 2018
    6. E. Kamalanaban, M. Gopinath and S. Premkumar, "Medicine box: Doctor's prescription recognition using deep machine learning", International Journal of Engineering and Technology (UAE), vol. 7, no. 3.34, pp. 114-117, 2018
    7. B. Arizanović and V. Vučković, "Efficient Compression and Decompression Algorithms for OCR Systems", Facta Universitatis, Series: Electronics and Energetics, vol. 31, no. 3, pp. 461-485, 2018
    8. M. Daldali and A. Souhar, "Handwritten Arabic Documents Segmentation into Text Lines using Seam Carving", International Journal of Interactive Multimedia and Artificial Intelligence (IJIMAI), DOI: 10.9781/ijimai.2018.06.002, 2018
    9. P. Sahare and S.B. Dhok, "Multilingual Character Segmentation and Recognition Schemes for Indian Document Images", IEEE Access, DOI: 10.1109/ACCESS.2018.2795104, 2018
    10. L. Melinda, R. Ghanapuram and C. Bhagvati, “Document Layout Analysis Using Multigaussian Fitting”, 14th IAPR International Conference on Document Analysis and Recognition (ICDAR'17), pp. 747-752, 2017
    11. F. Simistira, M. Bouillon, M. Seuret, M. Wursch, M. Alberti, R. Ingold and M. Liwicki, “ICDAR2017 Competition on Layout Analysis for Challenging Medieval Manuscripts”, 14th IAPR International Conference on Document Analysis and Recognition (ICDAR'17), pp. 1361-1370, 2017
    12. V. Vučković and B. Arizanović, "General Character Segmentation Approach for Machine-Typed Documents", 4th International Conference on Electrical, Electronic and Computing Engineering (ETRAN'17), pp. RTI2.2.1-6, 2017
    13. H. Jain and A.P. Kumar, "A Bottom Up Procedure for Text Line Segmentation of Latin Script", International Conference on Advances in Computing, Communications and Informatics (ICACCI'17), 2017
    14. A. Souhar, Y. Boulid, E. Ameur and M.M. Ouagague, "Watershed transform for text lines extraction on binary Arabic handwritten documents", 2nd International Conference on Big Data Cloud and Applications (BDCA'17), 2017
    15. A. Souhar, Y. Boulid, E. Ameur and M.M. Ouagague, "Segmentation of Arabic Handwritten Documents into Text Lines using Watershed Transform", International Journal of Interactive Multimedia and Artificial Intelligence, vol. 4, no. 6, pp. 96-102, 2017
    16. N.R. Soora and P.S. Deshpande, "A novel local skew correction and segmentation approach for printed multilingual Indian documents", Alexandria Engineering Journal, https://doi.org/10.1016/j.aej.2017.06.010, 2017
    17. V. Vučković and B. Arizanović, "Efficient Character Segmentation Approach for Machine-Typed Documents", Expert Systems with Applications, http://dx.doi.org/10.1016/j.eswa.2017.03.027, 2017
    18. M. Mehri, P. Héroux, P. Gomez-Krämer and R. Mullot, “Texture feature benchmarking and evaluation for historical document image analysis”, International Journal on Document Analysis and Recognition (IJDAR), DOI: 10.1007/s10032-016-0278-y, 2017
    19. K. Jindal and R. Kumar, “A Novel Shape-Based Character Segmentation Method for Devanagari Script”, Arabian Journal for Science and Engineering, vol. 42, no. 8, pp. 3221-3228, 2017
    20. J. Mtimet and H. Amiri, “A Combined Layer-Based Approach for the Segmentation of Document Images”, Journal of Circuits, Systems, and Computers, vol. 26, no. 10, 2017
    21. P. Sahare and S.B. Dhok, "Review of Text Extraction Algorithms for Scene-text and Document Images", IETE Technical Review, vol 34, no. 2, pp. 144-164, 2017
    22. Y. Yang, R. Pintus, E. Gobbetti and H. Rushmeier, "Automatic single page-based algorithms for medieval manuscript analysis", Journal on Computing and Cultural Heritage, vol 10, no. 2, art. no. 9, 2017
    23. A.S. Kavitha, P. Shivakumara, G.H. Kumar and T. Lu, “A New Watershed Model based System for Character Segmentation in Degraded Text Lines”, International Journal of Electronics and Communications, DOI: 10.1016/j.aeue.2016.11.007, 2016
    24. S. Dey, J. Mukherjee and S. Sural, “Consensus-based clustering for document image segmentation”, International Journal on Document Analysis and Recognition (IJDAR), DOI: 10.1007/s10032-016-0275-1, 2016
    25. R. Sharma, “Page Blocks Classification Using Rough Sets”, International Journal of Electrical Electronics & Computer Science Engineering, vol. 3, no. 2, pp. 25-28, 2016
    26. T. Mondal, N. Ragot, J.Y. Ramel and U. Pal, “Flexible Sequence Matching Technique: An Effective Learning-free Approach For word-spotting”, Pattern Recognition, vol. 60, pp. 596-612, 2016
    27. M. Kassis and J. El-Sana, “Complete Text Line Extraction with Convolutional Neural Networks and Watershed Transform”, 12th Workshop on Document Analysis Systems (DAS'16), pp. 239-244, Santorini, Greece, 2016
    28. J.P. Pellicer, M.Z. Afzal, M. Liwicki and M.J. Castro-Bleda, “Complete Text Line Extraction with Convolutional Neural Networks and Watershed Transform”, 12th Workshop on Document Analysis Systems (DAS'16), pp. 30-35, Santorini, Greece, 2016
    29. Z. Liu, F. Cheng and H. Hong, “Identification of Impurities in Fresh Shrimp Using Improved Majority Scheme-Based Classifier”, Journal of Food Analytical Methods, DOI: 10.1007/s12161-016-0497-3, 2016
    30. C. Grouin, “Text segmentation of digitized clinical texts”, 10th edition of the Language Resources and Evaluation Conference (LREC'16), pp. 3592-3599, Portorož, Slovenia, 2016
    31. A. Baig, S. Al-Maadeed, A. Bouridane and M. Cheriet, “Direct Unsupervised Text Line Extraction from Colored Historical Manuscript Images Using DCT”, 13th International Conference on Image Analysis and Recognition (ICIAR'16), pp. 753-762, Portugal, 2016
    32. K. Tanaka and K. Terasawa, "Character recognition of medieval English manuscripts supported by a word frequency table", 3rd IAPR Asian Conference on Pattern Recognition (ACPR), Malaysia, pp. 700-704, 2016
    33. R.R. Nair, B.U. Kota, I. Nwogu and V. Govindaraju, “Segmentation of highly unstructured handwritten documents using a neural network technique”, 23rd International Conference on Pattern Recognition (ICPR'16), pp. 1291-1296, 2016
    34. N. Venkata Rao, A.S.C.S. Sastry, A.S.N. Chakravarthy, and A.V. Srinivasa Rao, “Analysis of canonical character segmentation technique for ancient Telugu text documents”, Journal of Theoretical and Applied Information Technology, vol. 82, no. 2, pp. 311-320, 2015
    35. Y. Lin, Y. Li, Y. Song and F. Wang, “Fast document image comparison in multilingual corpus without OCR”, Multimedia Systems, pp. 1-10, DOI: 10.1007/s00530-015-0484-3, 2015
    36. J. Puigcerver, A.H. Toselli and E. Vidal, “ICDAR2015 Competition on Keyword Spotting for Handwritten Documents”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 1176-1180, Nancy, France, 2015
    37. M. Javed, P. Nagabhushan and B.B. Chaudhuri, “A Direct Approach for Word and Character Segmentation in Run-Length Compressed Documents with an Application to Word Spotting”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 216-220, Nancy, France, 2015
    38. J. Wu, F. Da, C. Wang and S. Gai, “Handwritten Character Recognition Based on Weighted Integral Image and Probability Model”, 8th International Conference on Image and Graphics (ICIG'15), China, 2015
    39. R.K. Mohapatra, B. Majhi and S.K. Jena, “Printed Odia Digit Recognition Using Finite Automaton”, 3rd International Conference on Advanced Computing, Networking, and Informatics (ICACNI'15), KIIT University, Orissa, India, 2015
    40. A.B. Shinde and Y.H. Dandawate, “Shirorekha extraction in Character Segmentation for printed devanagri text in Document Image Processing”, 11th IEEE India Conference: Emerging Trends and Innovation in Technology (INDICON'14), Article number 7030535, 2015
    41. M. Mehri, P. Gomez-Krämer, P. Héroux and A. Boucher, “A texture-based pixel labeling approach for historical books”, Pattern Analysis and Applications, DOI: 10.1007/s10044-015-0451-9, 2015
    42. N. Arvanitopoulos and S. Süsstrunk, "Binarization-free Text Line Extraction for Historical Manuscripts", 25th Conference on Digital Humanities (DH2014), pp. 83-85, 2014
    43. P. Duygulu, D. Arifoglu and M. Kalpakli, “Cross-document word matching for segmentation and retrieval of Ottoman divans”, Pattern Analysis and Applications, DOI: 10.1007/s10044-014-0420-8, 2014
    44. N. Arvanitopoulos and S. Süsstrunk, "Seam Carving for Text Line Extraction on Color and Grayscale Historical Manuscripts", 14th International Conference on Frontiers in Handwriting Recognition (ICFHR'14), pp. 726-731, Crete, Greece, September 2014
    45. G. Kamola, M. Spytkowski, M. Paradowski and U. Markowska-Kaczmar, “Image-based logical document structure recognition”, Pattern Analysis and Applications, DOI: 10.1007/s10044-014-0412-8, 2014
    46. K. Khankasikam, “Restoration of Degraded Historical Document Image: An Adaptive Multilayer-Information Binarization Technique”, Journal of Information Science and Engineering, vol. 30, no. 5, pp. 1321-1338, 2014
    47. J. Ji, L. Peng and B. Li, “Graph Model Optimization Based Historical Chinese Character Segmentation Method”, 11th IAPR International Workshop on Document Analysis Systems (DAS'14), Tours, France, pp. 282-286, 2014
    48. A. Fischer, M. Baechler, A. Garz, M. Liwicki and R. Ingold, “A Combined System for Text Line Extraction and Handwriting Recognition in Historical Documents”, 11th IAPR International Workshop on Document Analysis Systems (DAS'14), Tours, France, pp. 71-75, 2014
    49. G.F. Chen and J.S. Sheu, “An optical music recognition system for traditional Chinese Kunqu Opera scores written in Gong-Che Notation”, Eurasip Journal on Audio, Speech, and Music Processing, vol. 2014, March 2014, Article number 7, 2014
    50. M.A. Ramírez-Ortegón, L.L. Ramírez-Ramírez, I.B. Messaoud, V. Märgner, E. Cuevas and R. Rojas, “A model for the gray-intensity distribution of historical handwritten documents and its application for binarization”, International Journal on Document Analysis and Recognition, vol. 17, no. 2, pp. 139-160, 2014
    51. L. Huang, F. Yin and Q. Chen, “Graph-based ensemble method for text line segmentation in offline Chinese handwritten documents”, Journal of Huazhong University of Science and Technology (Natural Science Edition), vol. 42, no. 3, pp. 33-36, 2014
    52. G.F. Chen, “Intangible cultural heritage preservation: An exploratory study of digitization of the historical literature of Chinese Kunqu opera librettos”, Journal on Computing and Cultural Heritage (JOCCH), vol. 7, no. 1, Article No. 4, 2014
    53. J. Ramya, and B. Parvathavarthini, “Feed forward back propagation neural network based character recognition system for tamil palm leaf manuscripts”, Journal of Computer Science, vol. 10, no. 4, pp. 660-670, 2014
    54. V.K. Koppula and A. Negi, “Segmentation of closely set and touching lines in handwritten document images using fringe maps”, International Conference for Convergence of Technology (I2CT'14), Pune, India, 2014
    55. N. Audenaert and N.M. Houston, “VisualPage: Towards large scale analysis of nineteenth-century print culture”, International Conference on Big Data, pp. 9-16, Santa Clara, USA, 2013
    56. M. Javed, P. Nagabhushan and B.B. Chaudhuri, “Extraction of line-word-character segments directly from run-length compressed printed text-documents”, 4th National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG 2013), art. no. 6776195, Rajasthan, India, 2013
    57. M. Baechler, M. Liwicki and R. Ingold, “Text Line Extraction using DMLP Classifiers for Historical Manuscripts”, 12th International Conference on Document Analysis and Recognition (ICDAR'13), pp. 1029-1033, Washington DC, USA, August 2013
    58. Y. Mei, X. Wang and J. Wang, “A Chinese Character Segmentation Algorithm for Complicated Printed Documents”, International Journal of Signal Processing, Image Processing and Pattern Recognition, vol. 6, no. 3, pp. 91-100, 2013
    59. M.A. Ramírez-Ortegón, V. Märgner, E. Cuevas and R. Rojas, “An optimization for binarization methods by removing binary artifacts”, Pattern Recognition Letters, vol. 34, no. 11, pp. 1299-1306, 2013
    60. Y. Mei, X. Wang and J. Wang, “An Efficient Character Segmentation Algorithm for Printed Chinese Documents”, Ubiquitous Computing and Multimedia Applications, vol. 22, pp. 183-189, 2013
    61. I.B. Messaoud, H. Amiri, H.E. Abed and V. Märgner, “A multilevel text line segmentation framework for handwritten historical documents”, 13th International Conference on Frontiers in Handwriting Recognition (ICFHR'12), pp. 515-520, Bari, Italy, September 2012
    62. G. Chen, W. Zhang and H. Cui, “Extracting Notes from Chinese Gong-che Notation Musical Score Image Using a Self-adaptive Smoothing and Connected Component Labeling Algorithm”, International Journal of Advancements in Computing Technology, vol. 4, no. 1, pp. 86-95, 2012
    63. S. Dey, J. Mukhopadhyay, S. Sural and P. Bhowmick, “Margin Noise Removal from Printed Document Images”, Workshop on Document Analysis and Recognition (DAR'12), Mumbai, ACM Press, 2012
    64. E. Matthaiou and E. Kavallieratou, “An information extraction system from patient historical documents”, 27th Annual ACM Symposium on Applied Computing (SAC'12), pp. 787-791, Italy, March 2012
    65. G. Dang and X. Cheng, “Research on the robustness and integral performance optimization of PI control system”, Journal of Convergence Information Technology, vol. 7, no. 11, pp. 209-216, 2012
    66. Y. Huang, “Research on the line loss rate prediction technology based on the kernel partial least squares”, Journal of Convergence Information Technology, vol. 7, no. 11, pp. 376-383, 2012
    67. A. Garz, A. Fischer, R. Sablatnig and H. Bunke, "Binarization-Free Text Line Segmentation for Historical Documents Based on Interest Point Clustering", 10th IAPR International Workshop on Document Analysis Systems (DAS'12), pp. 95-99, Gold Coast, Queensland, Australia, 2012
    68. A.O. Rait and K.S. Venkatesh, “Automatic language-independent indexing of documents using image processing”, 7th International Conference on MEMS, NANO and Smart Systems (ICMENS'11), pp. 817-822, Kuala Lumpur, Malaysia, November 2011
    69. M. Rais, N.A. Goussies and M. Mejail, “Using adaptive run length smoothing algorithm for accurate text localization in images”, 16th Iberoamerican Congress on Pattern Recognition, (CIARP'11), pp. 149-156, Pucón, Chile, November 2011
    70. P. Soujanya, V.K. Koppula, K. Gaddam and P. Sruthi, “Comparative Study of Text Line Segmentation Algorithms on Low Quality Documents”, International Journal of Computer Science & Informatics, vol II, pp. 110-116, 2011
    71. C. Neudecker, Z.M. Dogan, S. Schlarb, P. Missier, S. Sufi, A. Williams and K. Wolstencroft, “An Experimental Workflow Development Platform for Historical Document Digitisation and Analysis”, 1st Workshop on Historical Document Imaging and Processing (HIP'11), pp. 161-168, Beijing, China, September 2011
    72. M. Zhao, S. Li and J. Kwok, “Text detection in images using sparse representation with discriminative dictionaries”, Image and Vision Computing, vol. 28, no. 12, pp. 1590-1599, 2010.
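
The Adaptive Run Length Smoothing Algorithm (ARLSA) named in the Image and Vision Computing abstract above builds on run-length smoothing. The sketch below shows only the classic, non-adaptive horizontal variant, as an illustration of the underlying idea; it is not the authors' ARLSA. The function name `rlsa_horizontal` and the single fixed threshold `max_gap` are assumptions made for this example.

```python
import numpy as np

def rlsa_horizontal(binary, max_gap):
    """Classic (non-adaptive) horizontal run-length smoothing.

    binary:  2-D array, 1 = foreground (ink), 0 = background.
    max_gap: hypothetical fixed threshold; a background run of at most
             this length between two foreground pixels on the same row
             is filled with foreground.
    """
    out = np.array(binary, dtype=np.uint8, copy=True)
    for row in out:                       # each row is a view into `out`
        fg = np.flatnonzero(row)          # columns of foreground pixels
        for left, right in zip(fg[:-1], fg[1:]):
            gap = right - left - 1        # background pixels in between
            if 0 < gap <= max_gap:
                row[left + 1:right] = 1   # merge the two neighbours
    return out
```

With a small `max_gap`, characters of the same word merge into one blob while wider column or word gaps stay open; an adaptive variant would presumably choose the threshold per component rather than globally.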

  • N. Stamatopoulos, B. Gatos and S.J. Perantonis, “A Method for Combining Complementary Techniques for Document Image Segmentation”, Pattern Recognition Journal, vol. 42, no. 12, pp. 3158-3168, 2009. impact factor: 2.554

  • Image segmentation is a major task of handwritten document image processing. Many of the proposed techniques for image segmentation are complementary in the sense that each of them using a different approach can solve different difficult problems such as overlapping, touching components, influence of author or font style etc. In this paper, a combination method of different segmentation techniques is presented. Our goal is to exploit the segmentation results of complementary techniques and specific features of the initial image so as to generate improved segmentation results. Experimental results on line segmentation methods for handwritten documents demonstrate the effectiveness of the proposed combination method.
    1. G.A. Farulla, N. Murru and R. Rossini, “A Fuzzy Approach to Segment Touching Characters”, Expert Systems with Applications, vol. 88, no. 1, pp. 1-13, 2017.
    2. F. Drira and F. LeBourgeois, “Mean-Shift segmentation and PDE-based nonlinear diffusion: toward a common variational framework for foreground/background document image segmentation”, International Journal on Document Analysis and Recognition (IJDAR), DOI: 10.1007/s10032-017-0285-7, 2017.
    3. S. Eskenazi, P. Gomez-Krämer and J.M. Ogier, “A comprehensive survey of mostly textual document segmentation algorithms since 2008”, Pattern Recognition, vol. 67, pp. 1-14, 2017.
    4. N.V. Borse and I.R. Shaikh, “Text Extraction from Handwritten Documents”, International Journal Of Engineering, Education And Technology (ARDIJEET), vol. 3, no. 2, 2015.
    5. L. Huang, F. Yin and Q. Chen, “Graph-based ensemble method for text line segmentation in offline Chinese handwritten documents”, Journal of Huazhong University of Science and Technology (Natural Science Edition), vol. 42, no. 3, pp. 33-36, 2014.
    6. T. Kathirvalavakumar and M.K. Selvi, “Efficient touching text line segmentation in Tamil script using horizontal projection”, 1st International Conference on Mining Intelligence and Knowledge Exploration (MIKE'13), pp. 279-288, Tamil Nadu, India, December 2013.
    7. N. Modi and K. Jindal, “Text Line detection and Segmentation in Handwritten Gurumukhi Scripts”, International Journal of Advanced Research in Computer Science and Software Engineering, vol. 3, no. 5, pp. 1075-1080, 2013.
    8. Y. Zhang and L. Wu, “Fast document image binarization based on an improved adaptive Otsu's method and destination word accumulation”, Journal of Computational Information Systems, vol. 7, no. 6, pp. 1886-1892, 2011.
Conferences

  • G. Retsinas, G. Louloudis, N. Stamatopoulos, G. Sfikas and B. Gatos, “An Alternative Deep Feature Approach to Line Level Keyword Spotting”, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR'19), pp. 12658-12666, California, USA, 2019.

  • Keyword spotting (KWS) is defined as the problem of detecting all instances of a given word, provided by the user either as a query word image (Query-by-Example, QbE) or a query word string (Query-by-String, QbS) in a body of digitized documents. Keyword detection is typically preceded by a preprocessing step where the text is segmented into text lines (line-level KWS). Methods following this paradigm are monopolized by test-time computationally expensive handwritten text recognition (HTR)-based approaches; furthermore, they typically cannot handle image queries (QbE). In this work, we propose a time and storage-efficient, deep feature-based approach that enables both the image and textual search options. Three distinct components, all modeled as neural networks, are combined: normalization, feature extraction and representation of image and textual input into a common space. These components, even if designed on word level image representations, collaborate in order to achieve an efficient line level keyword spotting system. The experimental results indicate that the proposed system is on par with state-of-the-art KWS methods.

  • G. Retsinas, G. Sfikas, G. Louloudis, N. Stamatopoulos and B. Gatos, “Compact Deep Descriptors for Keyword Spotting”, 16th International Conference on Frontiers in Handwriting Recognition (ICFHR'18), pp. 315-320, Niagara Falls, USA, 2018.

  • In this work, we present a novel approach for the extraction of deep features from a Convolutional Neural Network (CNN), designed for the task of Keyword Spotting (KWS). The main novelty of our work concerns the generation of a compact descriptor able to simulate the existence/absence of unigrams or bigrams. This is accomplished using a binary, attribute-based representation of a word string together with an appropriate training procedure. Deep features are extracted from the output of the last convolutional layer and are organized into zones in order to incorporate spatial information of the detected attributes. In addition, a novel optimization scheme is proposed which relies on a very effective initialization of the network generating the compact descriptors. Experiments conducted on the IAM dataset prove the efficiency of the novel compact descriptor since the proposed system’s performance is on par with the state-of-the-art.

  • G. Retsinas, G. Sfikas, N. Stamatopoulos, G. Louloudis and B. Gatos, “Exploring critical aspects of CNN-based Keyword Spotting. A PHOCNet study”, 13th IAPR International Workshop on Document Analysis Systems (DAS'18), pp. 13-18, Vienna, Austria, 2018.

  • Deep convolutional neural networks are today the new baseline for a wide range of machine vision tasks. The problem of keyword spotting is no exception to this rule. Many successful network architectures and learning strategies have been adapted from other vision tasks to create successful keyword spotting systems. In this paper, we argue that various details concerning this adaptation could be reexamined, to the end of building stronger spotting models. In particular, we examine the usefulness of a pyramidal spatial pooling layer versus a simpler approach, and show that a zoning strategy combined with fixed-size inputs can be just as effective while less computationally expensive. We also examine the usefulness of augmentation, class balancing and ensemble learning strategies and propose an improved network. Our hypotheses are tested with numerical experiments on the IAM document collection, where the proposed network outperforms all other existing models.
    1. G. Dinelli, G. Meoni, E. Rapuano, G. Benelli and L. Fanucci, "An FPGA-Based Hardware Accelerator for CNNs Using On-Chip Memories Only: Design and Benchmarking with Intel Movidius Neural Compute Stick", International Journal of Reconfigurable Computing, https://doi.org/10.1155/2019/7218758, 2019
    2. X. Wang, S. Sun, and L. Xie, "Virtual adversarial training for ds-cnn based small-footprint keyword spotting", IEEE Automatic Speech Recognition and Understanding Workshop (ASRU'19), 2019
    3. X. Wang, S. Sun, C. Shan, J. Hou, L. Xie, S. Li and X. Lei, "Adversarial Examples for Improving End-to-end Attention-based Small-footprint Keyword Spotting", IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 6366-6370, 2019
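
The PHOCNet study above contrasts a pyramidal spatial pooling layer with a simpler zoning strategy on fixed-size inputs. As a rough illustration of what zoning can look like, the sketch below max-pools a feature map over a fixed grid of zones. The grid size (`zones_y`, `zones_x`), the choice of max-pooling and the function name `zone_pool` are assumptions for this example and are not taken from the paper.

```python
import numpy as np

def zone_pool(feature_map, zones_y=1, zones_x=3):
    """Max-pool a (channels, height, width) feature map over a fixed grid.

    Returns a vector of length channels * zones_y * zones_x.
    Assumes height >= zones_y and width >= zones_x so no zone is empty.
    """
    _, h, w = feature_map.shape
    ys = np.linspace(0, h, zones_y + 1).astype(int)   # zone row boundaries
    xs = np.linspace(0, w, zones_x + 1).astype(int)   # zone column boundaries
    pooled = []
    for i in range(zones_y):
        for j in range(zones_x):
            zone = feature_map[:, ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
            pooled.append(zone.max(axis=(1, 2)))      # one value per channel
    return np.concatenate(pooled)
```

Because the input size is fixed, the output length is constant without needing several pyramid levels, which is one plausible reading of why such a layer is cheaper.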

  • G. Retsinas, G. Louloudis, N. Stamatopoulos, G. Sfikas and B. Gatos, “Nonlinear Manifold Embedding on Keyword Spotting using t-SNE”, 14th International Conference on Document Analysis and Recognition (ICDAR'17), pp. 487-492, Kyoto, Japan, 2017.

  • Nonlinear manifold embedding has attracted considerable attention due to its highly-desired property of efficiently encoding local structure, i.e. intrinsic space properties, into a low-dimensional space. The benefit of such an approach is twofold: it leads to compact representations while addressing the often-encountered curse of dimensionality. The latter plays an important role in retrieval applications, such as keyword spotting, where a sorted list of retrieved objects with respect to a distance metric is required. In this work, we explore the efficiency of the popular manifold embedding method t-distributed Stochastic Neighbor Embedding (t-SNE) on the Query-by-Example keyword spotting task. The main contribution of this work is the extension of t-SNE in order to support out-of-sample (OOS) embedding which is essential for mapping query images to the embedding space. The experimental results demonstrate a significant increase in keyword spotting performance when the word similarity is calculated on the embedding space.
    1. H. Wei, H. Zhang and G. Gao, "Word Image Representation Based on Visual Embeddings and Spatial Constraints for Keyword Spotting on Historical Documents", International Conference on Pattern Recognition (ICPR'18), pp. 3616-3621, 2018

  • S. Fiel, F. Kleber, M. Diem, V. Christlein, G. Louloudis, N. Stamatopoulos and B. Gatos, “ICDAR2017 Competition on Historical Document Writer Identification (Historical-WI)”, 14th International Conference on Document Analysis and Recognition (ICDAR'17), pp. 1377-1382, Kyoto, Japan, 2017.

  • The ICDAR 2017 Competition on Historical Document Writer Identification is dedicated to record the most recent advances made in the field of writer identification. The goal of the writer identification task is the retrieval of pages which have been written by the same author. The test dataset used in this competition consists of 3600 handwritten pages originating from the 13th to the 20th century. It contains manuscripts from 720 different writers where each writer contributed five pages. This paper describes the dataset, as well as the details of the competition. Five different institutions submitted six methods which were ranked using identification and retrieval metrics. The paper describes the competition details including the dataset, the evaluation measures used as well as a short description of each submitted method.
    1. S. Das, "A statistical tool based binarization method for document images", Multimedia Tools and Applications, vol. 78, no. 19, pp. 27449–27462, 2019
    2. M. Dahllöf, "Automatic Scribe Attribution for Medieval Manuscripts", Digital Medievalist, vol. 11, no. 1, pp. 6, 2018
    3. G. Abdeljalil, I. Siddiqi, C. Djeddi and S. Al-Maadeed, “Writer Identification on Historical Documents Using Oriented Basic Image Features”, 16th International Conference on Frontiers in Handwriting Recognition (ICFHR'18), pp. 369-373, 2018

  • G. Louloudis, G. Sfikas, N. Stamatopoulos and B. Gatos, “Word Segmentation using the Student’s-t Distribution”, 12th Workshop on Document Analysis Systems (DAS'16), pp. 78-83, Santorini, Greece, 2016.

  • Word segmentation refers to the process of defining the word regions of a text line. It is a critical stage towards word and character recognition as well as word spotting and mainly concerns three basic stages, namely preprocessing, distance computation and gap classification. In this paper, we propose a novel word segmentation method which uses the Student’s-t distribution for the gap classification stage. The main advantage of the Student’s-t distribution concerns its robustness to the existence of outliers. In order to test the efficiency of the proposed method we used the four benchmarking datasets of the ICDAR/ICFHR Handwriting Segmentation Contests as well as a historical typewritten dataset of Greek polytonic text. It is observed that the use of mixtures of Student’s-t distributions for word segmentation outperforms other gap classification methods in terms of Recognition Accuracy and F-Measure. Also, in terms of all examined benchmarks, the Student's-t is shown to produce a perfect segmentation result in significantly more cases than the state-of-the-art Gaussian mixture model.
    1. G. Axler and L. Wolf, "Toward a Dataset-Agnostic Word Segmentation Method", 25th IEEE International Conference on Image Processing (ICIP'18), pp. 2635-2639, 2018
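
The word segmentation abstract above names three stages: preprocessing, distance computation and gap classification. The following minimal sketch covers the last two, assuming preprocessing has already produced horizontal extents for the connected components of one text line. Each gap is assigned to the more probable of two Student's-t components. The component parameters are passed in by hand here; estimating them from data, as a mixture model would, is omitted. All function names and the parameter layout `(mu, sigma, nu)` are assumptions for illustration, not the paper's formulation.

```python
import math

def student_t_pdf(x, mu, sigma, nu):
    """Density of a location-scale Student's-t distribution."""
    z = (x - mu) / sigma
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi) - math.log(sigma))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(z * z / nu))

def gaps_between(boxes):
    """Distance computation: horizontal gaps between consecutive components.

    boxes: list of (x_start, x_end) extents, one per connected component.
    Overlapping components yield a gap of 0.
    """
    if not boxes:
        return []
    boxes = sorted(boxes)
    gaps, reach = [], boxes[0][1]
    for start, end in boxes[1:]:
        gaps.append(max(0, start - reach))
        reach = max(reach, end)
    return gaps

def classify_gaps(gaps, intra, inter, prior_inter=0.5):
    """Gap classification: True = gap between words, False = within a word.

    intra, inter: hypothetical (mu, sigma, nu) tuples for the two components.
    """
    labels = []
    for g in gaps:
        p_inter = prior_inter * student_t_pdf(g, *inter)
        p_intra = (1 - prior_inter) * student_t_pdf(g, *intra)
        labels.append(p_inter > p_intra)
    return labels
```

The heavy tails of the Student's-t density mean that a single unusually wide or narrow gap pulls the decision boundary less than it would under a Gaussian, which matches the robustness to outliers the abstract describes.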

  • G. Retsinas, G. Louloudis, N. Stamatopoulos and B. Gatos, “Keyword Spotting in Handwritten Documents using Projections of Oriented Gradients”, 12th Workshop on Document Analysis Systems (DAS'16), pp. 411-416, Santorini, Greece, 2016.

  • In this paper, we present a novel approach for segmentation-based handwritten keyword spotting. The proposed approach relies upon the extraction of a simple yet efficient descriptor which is based on projections of oriented gradients. To this end, a global and a local word image descriptor, together with their combination, are proposed. Retrieval is performed using the Euclidean distance between the descriptors of a query image and the segmented word images. The proposed methods have been evaluated on the dataset of the ICFHR 2014 Competition on handwritten keyword spotting. Experimental results prove the efficiency of the proposed methods compared to several state-of-the-art techniques.
    1. H. El Bahi and A. Zatni, “Text recognition in document images obtained by a smartphone based on deep convolutional and recurrent neural network”, Multimedia Tools and Applications, vol. 78, no. 18, pp 26453–2648, 2019
    2. V. Thakur and H. Sikarwar, “Deep Learning Feature Extraction for Handwritten Keyword Spotting in Historical Documents”, 2nd International Conference on Emerging Trends in Engineering & Applied Science (ICETEAS'19), vol. 5, no. 1, pp. 11 – 15, 2019
    3. P. Shivakumara, S. Roy, H.A. Jalab, R.W. Ibrahim, U. Pal, T. Luc, V. Khare and A. Wahaba, “Fractional means based method for multi-oriented keyword spotting in video/scene/license plate images”, Expert Systems with Applications, https://doi.org/10.1016/j.eswa.2018.08.015
    4. R. Ahmed, W.G. Al-Khatib and S. Mahmoud, “A Survey on handwritten documents word spotting”, International Journal of Multimedia Information Retrieval, DOI: 10.1007/s13735-016-0110-y, 2016

  • G. Retsinas, G. Louloudis, N. Stamatopoulos and B. Gatos, “Efficient Document Image Segmentation Representation by Approximating Minimum-Link Polygons”, 12th Workshop on Document Analysis Systems (DAS'16), pp. 293-298, Santorini, Greece, 2016.

  • The result of a document image segmentation task, e.g. text line or word segmentation, is usually a labeled image with each label corresponding to a different segmented region. For many applications, the segmented regions need to be stored and represented in an efficient way, using simple geometric shapes. A challenging task is to enclose all pixels corresponding to a specific label inside a polygon with a minimum number of vertices. Such a polygon promotes description simplicity and storage efficiency, while providing a much more user-friendly representation that can be edited easily. The proposed method is a cost-effective approximation of the minimum-edges polygon problem, computing a contour enclosing only pixels of a certain label and using a greedy algorithm in order to reduce the contour into a minimum-link polygon that retains the separability property between the labeled sets of pixels.

  • G. Sfikas, G. Louloudis, N. Stamatopoulos and B. Gatos, “Bayesian mixture models on connected components for Newspaper article segmentation”, ACM Symposium on Document Engineering (DocEng'16), pp. 143-146, Vienna, Austria, 2016.

  • In this paper we propose a new method for automated segmentation of scanned newspaper pages into articles. Article regions are produced as a result of merging sub-article level content and title regions. We use a Bayesian Gaussian mixture model to model page Connected Component information and cluster input into sub-article components. The Bayesian model is conditioned on a prior distribution over region features, aiding classification into titles and content. Using a Dirichlet prior we are able to automatically and correctly estimate the number of title and article regions. The method is tested on a dataset of digitized historical newspapers, where visual experimental results are very promising.

  • N. Stamatopoulos, G. Louloudis and B. Gatos, “Goal-Oriented Performance Evaluation Methodology for Page Segmentation Techniques”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 281-285, Nancy, France, 2015.

  • Document image segmentation is a fundamental step in the document image analysis pipeline as it affects the accuracy of subsequent processing steps. An objective and realistic evaluation of page segmentation techniques is crucial for a quantitative comparison among them. In this paper, a goal-oriented performance evaluation methodology that calculates a comprehensive evaluation measure SR (Success Rate) is presented. The SR measure reflects the entire performance of a page segmentation technique in a concise quantitative manner. It is a pixel-based approach which avoids the dependence on a strictly defined ground-truth. The proposed evaluation measure SR deals only with text regions and is correlated with the percentage of the text information to which the subsequent processing (e.g. text line segmentation and recognition) can be applied successfully.
    1. L. Quirós, C.D. Martínez-Hinarejos, A.H. Toselli and E. Vidal, “Interactive Layout Detection”, Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA'17), pp. 161-168, Faro, Portugal, 2017
    2. S. Eskenazi, P. Gomez-Kramer and J.M. Ogier, “Evaluation of the stability of four document segmentation algorithms”, 12th Workshop on Document Analysis Systems (DAS'16), pp. 215-220, Santorini, Greece, 2016

  • B. Gatos, N. Stamatopoulos, G. Louloudis, G. Sfikas, G. Retsinas, V. Papavassiliou, F. Simistira and V. Katsouros, “GRPOLY-DB: An Old Greek Polytonic Document Image Database”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 646-650, Nancy, France, 2015.

  • Recognition of old Greek document images containing polytonic (multi accent) characters is a challenging task due to the large number of existing character classes (more than 270) which cannot be handled sufficiently by current OCR technologies. Taking into account that the Greek polytonic system was used from the late antiquity until recently, a large amount of scanned Greek documents still remains without full text search capabilities. In order to assist the progress of relevant research, this paper introduces the first publicly available old Greek polytonic database GRPOLY-DB for the evaluation of several document image processing tasks. It contains both machine-printed and handwritten documents as well as annotation with ground-truth information that can be used for training and evaluation of the most common document image processing tasks, i.e., text line and word segmentation, text recognition, isolated character recognition and word spotting. Results using several representative baseline technologies are also presented in order to help researchers evaluate their methods and advance the frontiers of old Greek document image recognition and word spotting.
    1. P.P. Roy, A.K. Bhunia, A. Bhattacharyya and U. Pal, “Word searching in scene image and video frame in multi-script scenario using dynamic shape coding”, Multimedia Tools and Applications, https://doi.org/10.1007/s11042-018-6484-5, 2018
    2. D. Aldavert and M. Rusinol, “Manuscript text line detection and segmentation using second-order derivatives”, 13th IAPR International Workshop on Document Analysis Systems (DAS'18), pp. 293-298, 2018
    3. M. Mehri, P. Héroux, R. Mullot, J.P. Moreux, B. Coüasnon, B. Bertrand and B. Barrett, “HBA 1.0: A Pixel-based Annotated Dataset for Historical Book Analysis”, International Workshop on Historical Document Imaging and Processing (HIP'17), pp. 107-112, 2017
    4. R. Ahmed, W.G. Al-Khatib and S. Mahmoud, “A Survey on handwritten documents word spotting”, International Journal of Multimedia Information Retrieval, DOI: 10.1007/s13735-016-0110-y, 2016

  • G. Retsinas, B. Gatos, N. Stamatopoulos and G. Louloudis, “Isolated Character Recognition using Projections of Oriented Gradients”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 336-340, Nancy, France, 2015.

  • In this paper, we present a new approach for off-line isolated character recognition. The proposed method relies upon the application of a projection-based feature extraction stage, which resembles the Radon transform, on both the original image and a set of generated images corresponding to different gradient orientations of the original image. For the classification stage, Support Vector Machines (SVM) are used. The proposed method is evaluated using one typewritten (GRPOLY-DB - Historical Greek) and two handwritten (CIL - Greek, CEDAR - English) publicly available databases. Experimental results prove the efficiency of the proposed method compared to several state-of-the-art techniques.
    1. S. Kaur and S. Rani, “Isolated Curved Gurmukhi Character Recognition Using Projection of Gradient”, International Journal of Computational Intelligence Research (IJCIR), vol. 13, no. 6, pp. 1387-1396, 2017

  • G. Retsinas, B. Gatos, A. Antonacopoulos, G. Louloudis and N. Stamatopoulos, “Historical Typewritten Document Recognition Using Minimal User Interaction”, 3rd International Workshop on Historical Document Imaging and Processing (HIP’15), pp. 31-38, Nancy, France, 2015.

  • Recognition of low-quality historical typewritten documents can still be considered a challenging and difficult task due to several issues, such as the existence of faint and degraded characters, stains, tears and punch holes. In this paper, we exploit the unique characteristics of historical typewritten documents in order to propose an efficient recognition methodology that requires minimum user interaction. It is based on a pre-processing stage in order to enhance the quality and extract connected components, on a semi-supervised clustering for detecting the most representative character samples and on a segmentation-free recognition stage based on a template matching and cross-correlation technique. Experimental results prove that even with minimum user interaction, the proposed method can lead to promising accuracy results.

  • N. Stamatopoulos, G. Louloudis and B. Gatos, “A Novel Transcript Mapping Technique for Handwritten Document Images”, 14th International Conference on Frontiers in Handwriting Recognition (ICFHR'14), pp. 41-46, Crete, Greece, September 2014.

  • Transcript mapping refers to the process of aligning meaningful units of a handwritten document image (e.g. text lines, words, characters) with the corresponding transcription information. It has many applications such as (i) fast generation of ground truth at different granularity levels and (ii) indexing handwritten collections for document retrieval. In this paper, a novel transcript mapping technique is proposed which is guided by the number of words as well as the characters per word of a text line. The proposed method combines the results of a local and a global approach using a scoring algorithm. The efficiency of the proposed method is demonstrated by experimentation conducted on a known, publicly available dataset, achieving word level alignment accuracy of 99.48%.
    1. R. Cohen, I. Rabaev, J. El-Sana, K. Kedem and I. Dinstein, “Aligning transcript of historical documents using energy minimization”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 266-270, Nancy, France, 2015

  • B. Gatos, G. Louloudis and N. Stamatopoulos, “Segmentation of Historical Handwritten Documents into Text Zones and Text Lines”, 14th International Conference on Frontiers in Handwriting Recognition (ICFHR'14), pp. 464-469, Crete, Greece, September 2014.

  • In order to achieve accurate text recognition performance for historical handwritten document images, robust and efficient page segmentation is necessary. In this paper, we propose a text zone detection followed by a text line segmentation method suitable for historical handwritten documents. Our aim is to handle several challenging cases such as horizontal and vertical rule lines overlapping with the text, two column documents and characters of different text lines touching vertically. For text zone detection, we analyze vertical rule lines, connected components as well as vertical white runs while for text line segmentation, we enhance an existing approach based on Hough transform in order to better treat cases of vertically connected characters. Both methods have proved very promising after an evaluation using a set of historical handwritten documents.
    1. M. Pastor, "Text baseline detection, a single page trained system", Pattern Recognition, vol. 94, pp. 149-161, 2019
    2. S.R. Narang, M.K. Jindal and M. Kumar, "Drop flow method: an iterative algorithm for complete segmentation of Devanagari ancient manuscripts", Multimedia Tools and Applications. DOI: 10.1007/s11042-019-7620-6, 2019
    3. S. Capobianco, L. Scommegna and S. Marinai, "Historical handwritten document segmentation by using a weighted loss", 8th IAPR TC3 workshop on Artificial Neural Networks for Pattern Recognition (ANNPR'18), pp. 395-406, 2018
    4. P. Kahle, S. Colutto, G. Hackl and G. Mühlberger, "Transkribus-a Service Platform for Transcription, Recognition and Retrieval of Historical Documents", 14th International Conference on Document Analysis and Recognition (ICDAR'17), pp. 19-24, 2017
    5. A. Fawzi, M. Pastor and C.D. Martínez-Hinarejos, "Baseline Detection on Arabic Handwritten Documents", ACM Symposium on Document Engineering (DocEng'17), pp. 193-196, 2017
    6. V. Vučković and B. Arizanović, "Efficient Character Segmentation Approach for Machine-Typed Documents", Expert Systems with Applications, doi.org/10.1016/j.eswa.2017.03.027, 2017
    7. A.S. Kavitha, P. Shivakumara, G.H. Kumar and T. Lu, “Text segmentation in degraded historical document images”, Egyptian Informatics Journal, doi:10.1016/j.eij.2015.11.003, 2015
    8. V. Romero, J.A. Sanchez, V. Bosch, K. Depuydt and J. de Does, “Influence of Text Line Segmentation in Handwritten Text Recognition”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 536-540, Nancy, France, 2015

  • I. Pratikakis, K. Zagoris, B. Gatos, G. Louloudis and N. Stamatopoulos, “ICFHR 2014 Competition on Handwritten KeyWord Spotting (H-KWS 2014)”, 14th International Conference on Frontiers in Handwriting Recognition (ICFHR'14), pp. 814-819, Crete, Greece, September 2014.

  • H-KWS 2014 is the Handwritten Keyword Spotting Competition organized in conjunction with the ICFHR 2014 conference. The main objective of the competition is to record current advances in keyword spotting algorithms using established performance evaluation measures frequently encountered in the information retrieval literature. The competition comprises two distinct tracks, namely, a segmentation-based and a segmentation-free track. Five (5) distinct research groups have participated in the competition with three (3) methods for the segmentation-based track and four (4) methods for the segmentation-free track. The benchmarking datasets that were used in the contest contain both historical and modern documents from multiple writers. In this paper, the contest details are reported including the evaluation measures and the performance of the submitted methods along with a short description of each method.
    1. A. Hast and E. Vats, “Radial line fourier descriptor for historical handwritten text representation”, Journal of WSCG, vol. 26, no. 1, pp. 31-40, 2018
    2. A.H. Toselli, E. Vidal, J. Puigcerver and E. Noya-Garcia, “Probabilistic multi-word spotting in handwritten text images”, Pattern Analysis and Applications, DOI https://doi.org/10.1007/s10044-018-0742-z, 2018
    3. A. Santoro, C.D. Stefano and A. Marcelli, “Assisted Transcription of Historical Documents by Keyword Spotting: A Performance Model”, 14th IAPR International Conference on Document Analysis and Recognition (ICDAR'17), pp. 971-976, 2017
    4. M.L. Bouined, H. Nemmour and Y. Chibani, "New gradient descriptor for keyword spotting in handwritten documents", 3rd International Conference on Advanced Technologies for Signal and Image (ATSIP'17), 2017
    5. P.P. Roy, A.K. Bhunia, A. Das, P. Dhar and U. Pal, "Keyword spotting in doctor's handwriting on medical prescriptions", Expert Systems with Applications, vol. 76, no. 15, pp. 113-128, 2017
    6. A. Santoro, A. Parziale and A. Marcelli, “A human in the loop approach to historical handwritten documents transcription”, 15th International Conference on Frontiers in Handwriting Recognition (ICFHR'16), pp. 222-227, 2016
    7. R. Ahmed, W.G. Al-Khatib and S. Mahmoud, “A Survey on handwritten documents word spotting”, International Journal of Multimedia Information Retrieval, DOI: 10.1007/s13735-016-0110-y, 2016
    8. A. Hast and A. Fornes, “A Segmentation-free Handwritten Word Spotting Approach by Relaxed Feature Matching”, 12th Workshop on Document Analysis Systems (DAS'16), pp. 150-155, Santorini, Greece, 2016
    9. J. Puigcerver, A.H. Toselli and E. Vidal, “ICDAR2015 Competition on Keyword Spotting for Handwritten Documents”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 1176-1180, Nancy, France, 2015
    10. E. Vidal, A.H. Toselli and J. Puigcerver, “High Performance Query-by-Example Keyword Spotting Using Query-by-String Techniques”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 741-745, Nancy, France, 2015
    11. S. Yao, Y. Wen and Y. Lu, “HoG based Two-Directional Dynamic Time Warping for Handwritten Word Spotting”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 161-165, Nancy, France, 2015

  • B. Gatos, N. Stamatopoulos, G. Louloudis and S.J. Perantonis, “H-DocPro: a document image processing platform for historical documents”, 1st International Conference on Digital Access to Textual Cultural Heritage (DATeCH '14), pp. 131-136, Madrid, Spain, May 2014.

  • In this paper, we introduce the H-DocPro platform which is a publicly available document image processing platform for historical documents. H-DocPro is a result of our recent and on-going research on historical document image processing and has been developed in order to monitor the successive application of several new or state-of-the-art document image processing methods. It is an open architecture software platform that permits several document image processing modules and methods (e.g. binarization, image enhancement, page split) to be utilized in an easy to define processing workflow. We provide detailed information on how to use H-DocPro, the available modules and methods as well as the way one can add his own components exploiting the open architecture form of the platform. Representative examples and experimental results using large sets of historical document images demonstrate the efficiency of H-DocPro methods.

  • N. Stamatopoulos, G. Louloudis, B. Gatos, U. Pal and A. Alaei, “ICDAR2013 Handwriting Segmentation Contest”, 12th International Conference on Document Analysis and Recognition (ICDAR'13), pp. 1402-1406, Washington DC, USA, August 2013.

  • This paper presents the results of the Handwriting Segmentation Contest that was organized in the context of ICDAR2013. The general objective of the contest was to use well established evaluation practices and procedures to record recent advances in off-line handwriting segmentation. Two benchmarking datasets, one for text line and one for word segmentation, were created in order to test and compare all submitted algorithms as well as some state-of-the-art methods for handwritten document image segmentation in realistic circumstances. Handwritten document images were produced by many writers in two Latin based languages (English and Greek) and in one Indian language (Bangla, the second most popular language in India). These images were manually annotated in order to produce the ground truth which corresponds to the correct text line and word segmentation results. The datasets of previously organized contests (ICDAR2007, ICDAR2009 and ICFHR2010 Handwriting Segmentation Contests) along with a dataset of Bangla document images were used as training dataset. Eleven methods were submitted to this competition. A brief description of the submitted algorithms, the evaluation criteria and the segmentation results obtained from the submitted methods are also provided in this manuscript.
    1. B.M.K. Sharma and V.S. Dhaka, “Segmentation of handwritten words using structured support vector machine”, Pattern Analysis and Applications, DOI: https://doi.org/10.1007/s10044-019-00843-x, 2019
    2. S. Kundu, S. Paul, S.K. Bera, A. Abraham and R. Sarkar, "Text-line Extraction from Handwritten Document Images using GAN", Expert Systems with Applications, vol. 140, 2020
    3. F. Can and A. Yilmaz, “Hybrid handwriting character recognition with transfer deep learning”, 27th Signal Processing and Communications Applications Conference (SIU'19), Article number 8806364, 2019
    4. M.A. Garcia-Calderon, R.A. Garcia-Hernandez and Y. Ledeneva, “Providing order to the handwritten TLS task: A complexity index”, Journal of Intelligent and Fuzzy Systems, vol. 36, no. 5, pp. 4621-4631, 2019
    5. C. Adak, B.B. Chaudhuri and M. Blumenstein, "An empirical study on writer identification and verification from intra-variable individual handwriting”, IEEE Access, vol. 7, pp. 24738-24758, 2019
    6. M. Pastor, "Text baseline detection, a single page trained system", Pattern Recognition, vol. 94, pp. 149-161, 2019
    7. G. Axler and L. Wolf, "Toward a Dataset-Agnostic Word Segmentation Method", 25th IEEE International Conference on Image Processing (ICIP'18), pp. 2635-2639, 2018
    8. R. Saabni, “Robust and efficient text‐line extraction by local minimal sub-seams”, 2nd International Symposium on Computer Science and Intelligent Control (ISCSIC'18), 2018
    9. M.W.A. Kesiman, D. Valy, J.C. Burie, E. Paulus, M. Suryani, S. Hadi, M. Verleysen, S. Chhun and J.M. Ogier, “ICFHR 2018 Competition On Document Image Analysis Tasks for Southeast Asian Palm Leaf Manuscripts”, 16th International Conference on Frontiers in Handwriting Recognition (ICFHR'18), pp. 483-488, 2018
    10. V. Bosch, V. Romero, A.H. Toselli and E. Vidal, “Text Line Extraction Based on Distance Map Features and Dynamic Programming”, 16th International Conference on Frontiers in Handwriting Recognition (ICFHR'18), pp. 357-362, 2018
    11. C. Adak, B.B. Chaudhuri and M. Blumenstein, “A Study on Idiosyncratic Handwriting with Impact on Writer Identification”, 16th International Conference on Frontiers in Handwriting Recognition (ICFHR'18), pp. 193-198, 2018
    12. B. Barakat, A. Droby, M. Kassis and J. El-Sana, “Text Line Segmentation for Challenging Handwritten Document Images Using Fully Convolutional Network”, 16th International Conference on Frontiers in Handwriting Recognition (ICFHR'18), pp. 374-379, 2018
    13. T. Grüning, R. Labahn, M. Diem, F. Kleber and S. Fiel, “READ-BAD: A new dataset and evaluation scheme for baseline detection in archival documents”, 13th IAPR International Workshop on Document Analysis Systems (DAS'18), pp. 351-356, 2018
    14. B. Arizanović and V. Vučković, "Efficient Compression and Decompression Algorithms for OCR Systems", Facta Universitatis, Series: Electronics and Energetics, vol. 31, no. 3, pp. 461-485, 2018
    15. G. Renton, Y. Soullard, C. Chatelain, S. Adam, C. Kermorvant and T. Paquet, "Fully Convolutional Network with dilated convolutions for Handwritten text line segmentation", International Journal on Document Analysis and Recognition (IJDAR), 2018
    16. Q.N. Vo, S.H. Kim, H.J. Yang and G. Lee, "Text line segmentation using a fully convolutional network in handwritten document images", IET Image Processing, vol. 12, no. 3, pp. 438-446, 2018
    17. M.W.A. Kesiman, D. Valy, J.C. Burie, E. Paulus, M. Suryani, S. Hadi, M. Verleysen, S. Chhun and J.M. Ogier, "Benchmarking of Document Image Analysis Tasks for Palm Leaf Manuscripts from Southeast Asia", Journal of Imaging, vol. 4, no. 2, article number 43, 2018
    18. D. Valy, M. Verleysen and K. Sok, “Line segmentation for grayscale text images of Khmer palm leaf manuscripts”, 7th International Conference on Image Processing Theory, Tools and Applications (IPTA'17), pp. 1-6, 2017
    19. A. Rehman, "Offline touched cursive script segmentation based on pixel intensity analysis: Character segmentation based on pixel intensity analysis", International Conference on Digital Information Management (ICDIM'17), pp. 324-327, 2017
    20. I. Setitra, A. Meziane, Z. Hadjadj and N. Bengherbia, "Text line segmentation in handwritten documents based on connected components trajectory generation", 6th International Conference on Pattern Recognition Applications and Methods (ICPRAM'17), pp. 222-234, 2017
    21. I.M. Amer, S. Hamdy and M.G.M. Mostafa, "Deep Arabic document layout analysis", 8th International Conference on Intelligent Computing and Information Systems (ICICIS'17), pp. 224-231, 2017
    22. K. Thangairulappan and K. Mohan, "Efficient segmentation of printed Tamil script into characters using projection and structure", 4th International Conference on Image Information Processing (ICIIP'17), pp. 484-489, 2017
    23. V. Chavan and F. Mehrotra, "Text line segmentation of multilingual handwritten documents using fourier approximation", 4th International Conference on Image Information Processing (ICIIP'17), pp. 250-255, 2017
    24. W. Jia, L. Sun, Z. Zhong, X. Mo, G. Ma and Q. Huo, “A Robust Approach to Detecting Text from Images of Whiteboards and Handwritten Notes”, 14th IAPR International Conference on Document Analysis and Recognition (ICDAR'17), pp. 813-818, 2017
    25. C. Adak, B.B. Chaudhuri and M. Blumenstein, “Legibility and Aesthetic Analysis of Handwriting”, 14th IAPR International Conference on Document Analysis and Recognition (ICDAR'17), pp. 175-182, 2017
    26. T. Grüning, G. Leifert, T. Strauss and R. Labahn, “A Robust and Binarization-Free Approach for Text Line Detection in Historical Documents”, 14th IAPR International Conference on Document Analysis and Recognition (ICDAR'17), pp. 236-241, 2017
    27. K.C. Nguyen, C.T. Nguyen and M. Nakagawa, "A segmentation method of single- and multiple-touching characters in offline handwritten Japanese text recognition", IEICE Transactions on Information and Systems, vol. E100D, no. 12, pp. 2962-2972, 2017
    28. B. Ahn, J. Ryu, H.I. Koo and N.I. Cho, "Textline detection in degraded historical document images", EURASIP Journal on Image and Video Processing, vol. 2017, no. 1, pp. 82, 2017
    29. V. Vučković and B. Arizanović, "General Character Segmentation Approach for Machine-Typed Documents", 4th International Conference on Electrical, Electronic and Computing Engineering (ETRAN'17), pp. RTI2.2.1-6, 2017
    30. Y. Li, L. Ma, L. Duan, J. Wu, J. Yang, Q. Hu, M.M. Cheng, L. Wang, Q. Liu, X. Bai and D. Meng, "A Text-Line Segmentation Method for Historical Tibetan Documents Based on Baseline Detection", Chinese Conference on Computer Vision (CCCV'17), pp. 356-367, 2017
    31. R. Pramanik and S. Bag, "Linear Curve Fitting-Based Headline Estimation in Handwritten Words for Indian Scripts", International Conference on Pattern Recognition and Machine Intelligence (PReMI'17), pp. 116-123, 2017
    32. R. Pramanik and S. Bag, "Shape Decomposition-based Handwritten Compound Character Recognition for Bangla OCR", Journal of Visual Communication and Image Representation, vol. 50, pp. 123-134, 2017
    33. H. Jain and A.P. Kumar, "A Bottom Up Procedure for Text Line Segmentation of Latin Script", International Conference on Advances in Computing, Communications and Informatics (ICACCI'17), 2017
    34. L.M. Francisa and N. Sreenatha, "TEDLESS-Text Detection using Least-Square SVM from Natural Scene", Journal of King Saud University - Computer and Information Sciences, https://doi.org/10.1016/j.jksuci.2017.09.001, 2017
    35. A. Souhar, Y. Boulid, E. Ameur and M.M. Ouagague, "Watershed transform for text lines extraction on binary Arabic handwritten documents", 2nd International Conference on Big Data Cloud and Applications (BDCA'17), 2017
    36. Q.N. Vo, S.H. Kim, H.J. Yang and G. Lee, "Binarization of Degraded Document Images based on Hierarchical Deep Supervised Network", Pattern Recognition, DOI: doi.org/10.1016/j.patcog.2017.08.025, 2017
    37. A. Souhar, Y. Boulid, E. Ameur and M.M. Ouagague, "Segmentation of Arabic Handwritten Documents into Text Lines using Watershed Transform", International Journal of Interactive Multimedia and Artificial Intelligence, vol. 4, no. 6, pp. 96-102, 2017
    38. R. Amarnath and P. Nagabhushan, "Spotting Separator Points at Line Terminals in Compressed Document Images for Text-line Segmentation", International Journal of Computer Applications, vol. 172, no. 4, pp. 40-47, 2017
    39. C. Adak, B.B. Chaudhuri and M. Blumenstein, "Impact of struck-out text on writer identification", International Joint Conference on Neural Networks (IJCNN'17), pp. 1465-1471, 2017
    40. V. Vučković and B. Arizanović, "Efficient Character Segmentation Approach for Machine-Typed Documents", Expert Systems with Applications, http://dx.doi.org/10.1016/j.eswa.2017.03.027, 2017
    41. S. Eskenazi, P. Gomez-Krämer and J.M. Ogier, “A comprehensive survey of mostly textual document segmentation algorithms since 2008”, Pattern Recognition, vol. 67, pp. 1-14, 2017
    42. P. Sahare and S.B. Dhok, "Review of Text Extraction Algorithms for Scene-text and Document Images", IETE Technical Review, vol 34, no. 2, pp. 144-164, 2017
    43. H.I. Koo, “Text-line Detection in Camera-captured Document Images using the State Estimation of Connected Components”, IEEE Transactions on Image Processing, vol. 25, no. 11, pp. 5358-5369, 2016
    44. S. Hofmann, M. Gropp, D. Bernecker, C. Pollin, A. Maier and V. Christlein, “Vesselness for text detection in historical document images”, International Conference on Image Processing (ICIP'16), pp. 3259-3263, 2016
    45. Y. Boulid, A. Souhar and M.Y. Elkettani, “Arabic handwritten text line extraction using connected component analysis from a multi agent perspective”, International Conference on Intelligent Systems Design and Applications (ISDA'16), pp. 80-87, 2016
    46. B.B. Chaudhuri and C. Adak, “An Approach for Detecting and Cleaning of Struck-out Handwritten Text”, Pattern Recognition, doi:10.1016/j.patcog.2016.07.032, 2016
    47. B. Biswas, U. Bhattacharya and B.B. Chaudhuri, “A Robust Scheme for Extraction of Text Lines from Handwritten Documents”, International Conference on Computer Vision & Image Processing (CVIP'16), pp. 107-116, 2016
    48. K. Kadam, D. Phadatare, A. Mali, P. Nimbalkar and P. Gode, “Detection of Word by Inter - Intra Gap Technique for Handwritten Documents”, International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), vol. 5, no. 4, pp. 936-939, 2016
    49. P. Choudhary and N. Nain, “A Four-Tier Annotated Urdu Handwritten Text Image Dataset for Multidisciplinary Research on Urdu Script”, ACM Transactions on Asian and Low-Resource Language Information Processing, vol. 15, no. 4, article no. 26, 2016
    50. Y. Boulid, A. Souhar and M.Y. Elkettani, “Segmentation approach of Arabic manuscripts text lines based on multi agent systems”, International Journal of Computer Information Systems and Industrial Management Applications, vol. 8, no. 1, pp. 173-183, 2016
    51. Y. Boulid, A. Souhar and M.Y. Elkettani, “Detection of Text Lines of Handwritten Arabic Manuscripts using Markov Decision Processes”, International Journal of Interactive Multimedia and Artificial Intelligence, vol. 4, no. 1, pp. 31-36, 2016
    52. Z. Huang, J. Gu, G. Meng and C. Pan, "Text line extraction of curved document images using hybrid metric", 3rd IAPR Asian Conference on Pattern Recognition (ACPR), Malaysia, pp. 251-255, 2016
    53. B. Moysse, J. Louradour, C. Kermorvant and C. Wolf, “Learning text-line localization with shared and local regression neural networks”, 15th International Conference on Frontiers in Handwriting Recognition (ICFHR'16), pp. 1-6, 2016
    54. M.W.A. Kesiman, J.C. Burie and J.M. Ogier, “A New Scheme for Text Line and Character Segmentation from Gray Scale Images of Palm Leaf Manuscript”, 15th International Conference on Frontiers in Handwriting Recognition (ICFHR'16), pp. 325-330, 2016
    55. C. Adak, B. B. Chaudhuri and M. Blumenstein, “Offline Cursive Bengali Word Recognition using CNNs with a Recurrent Model”, 15th International Conference on Frontiers in Handwriting Recognition (ICFHR'16), pp. 429-434, 2016
    56. A. Abliz, W. Simayi, K. Moydin and A. Hamdulla, “Survey on Methods for Basic Unit Segmentation in Off-Line Handwritten Text Recognition”, International Journal of Future Generation Communication and Networking, vol. 9, no. 11, pp. 137-152, 2016
    57. X. Han, H. Yao and G. Zhong, “Handwritten Text Line Segmentation by Spectral Clustering”, Eighth International Conference on Graphic and Image Processing (ICGIP 2016), 102251A, 2016
    58. T. Wilkinson and A. Brun, “A Novel Word Segmentation Method Based on Object Detection and Deep Learning”, Advances in Visual Computing, 9474, pp. 231-240, 2015
    59. Z. Harbi, Y. Hicks, R. Setchi and A. Bayer, “Segmentation of Clock Drawings Based on Spatial and Temporal Features”, Procedia Computer Science, vol. 60, pp. 1640-1648, 2015
    60. E. Kavallieratou, “Word Segmentation Using Wigner-Ville Distribution”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 701-705, Nancy, France, 2015
    61. V. Romero, J.A. Sanchez, V. Bosch, K. Depuydt and J. de Does, “Influence of Text Line Segmentation in Handwritten Text Recognition”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 536-540, Nancy, France, 2015
    62. W. Swaileh, K.A. Mohand and T. Paquet, "Multi-script Iterative Steerable Directional Filtering For Handwritten Text Line Extraction", 5th International Workshop on Multilingual OCR (MOCR'15), pp. 1241-1245, Nancy, France, 2015
    63. M.K. Sharma and V.P. Dhaka, “Pixel plot and trace based segmentation method for bilingual handwritten scripts using feedforward neural network”, Neural Computing and Applications, DOI 10.1007/s00521-015-1972-2, 2015
    64. K. Mullick, S. Banerjee and U. Bhattacharya, “An efficient line segmentation approach for handwritten Bangla document image”, 8th International Conference on Advances in Pattern Recognition (ICAPR'15), no. 7050679, 2015
    65. R. Pintus, Y. Yang and H. Rushmeier, “ATHENA: Automatic text height extraction for the analysis of text lines in old handwritten manuscripts”, Journal of Computing and Cultural Heritage, vol. 8, no. 1, 2015
    66. B.L. Davis, W.A. Barrett and S.D. Swingle, “Min-cut segmentation of cursive handwriting in tabular documents”, Document Recognition and Retrieval XXII, Vol. 940208, 2015
    67. J. Ryu, H.I. Koo and N.I. Cho, “Word segmentation method for handwritten documents based on structured learning”, IEEE Signal Processing Letters, vol. 22, no. 8, pp. 1161-1165, 2015
    68. R. Cohen, I. Dinstein, J. El-Sana and K. Kedem, “Using Scale-Space Anisotropic Smoothing for Text Line Extraction in Historical Documents”, 11th International Conference on Image Analysis and Recognition (ICIAR'14), pp. 349-358, 2014
    69. X. Zhang and C.L. Tan, "Text Line Segmentation for Handwritten Documents Using Constrained Seam Carving", 14th International Conference on Frontiers in Handwriting Recognition (ICFHR'14), pp. 98-103, Crete, Greece, September 2014
    70. Y. Elarian, A. Zidouri and W. Al-Khatib, "Ground-truth and Metric for the Evaluation of Arabic Handwritten Character Segmentation", 14th International Conference on Frontiers in Handwriting Recognition (ICFHR'14), pp. 766-770, Crete, Greece, September 2014
    71. D. Fernández-Mota, J. Almazán, N. Cirera, A. Fornés and J. Lladós, "BH2M: the Barcelona Historical Handwritten Marriages database", International Conference on Pattern Recognition, pp. 256-261, 2014
    72. M. Diem, F. Kleber and R. Sablatnig, "Ruling analysis and classification of torn documents", In Proceedings of the 2014 ACM symposium on Document engineering (DocEng '14), Colorado, USA, pp. 63-72, 2014
    73. J. Ryu, H.I. Koo and N.I. Cho, “Language-Independent Text-Line Extraction Algorithm for Handwritten Documents”, IEEE Signal Processing Letters, vol. 21, no. 9, pp. 1115-1119, 2014
    74. D. Fernández-Mota, J. Lladós and A. Fornés, “A graph-based approach for segmenting touching lines in historical handwritten documents”, International Journal on Document Analysis and Recognition, vol. 17, no. 3, pp. 293-312, 2014
    75. A. Lemaitre, J. Camillerapp and B. Coüasnon, “Handwritten text segmentation using blurred image”, Proceedings of SPIE - The International Society for Optical Engineering, vol. 9021, article number 90210D, Document Recognition and Retrieval XXI, San Francisco, United States, 2014

  • A. Papandreou, B. Gatos, G. Louloudis and N. Stamatopoulos, “ICDAR2013 Document Image Skew Estimation Contest (DISEC’13)”, 12th International Conference on Document Analysis and Recognition (ICDAR'13), pp. 1444-1448, Washington DC, USA, August 2013.

  • The detection and correction of document skew is one of the most important document image analysis steps. The ICDAR2013 Document Image Skew Estimation Contest (DISEC’13) is the first contest dedicated to recording advances in the field of skew estimation using well-established evaluation performance measures on a variety of printed document images. The benchmarking dataset contains 1550 images obtained from various sources such as newspapers, scientific books and travel guides. The document images contain figures, tables, diagrams, architectural plans and electrical circuits, and they are written in various languages such as English, Chinese and Greek. This paper describes the details of the contest, including the evaluation measures used, as well as the performance of the twelve methods submitted by ten different groups along with a short description of each method.
    1. O. Boudraa, W.K. Hidouci and D. Michelucci, “Using skeleton and Hough transform variant to correct skew in historical documents”, Mathematics and Computers in Simulation, DOI: https://doi.org/10.1016/j.matcom.2019.05.009, 2019
    2. M.A. Garcia-Calderon, R.A. Garcia-Hernandez and Y. Ledeneva, “Unsupervised multi-language handwritten text line segmentation”, Journal of Intelligent and Fuzzy Systems, vol. 34, no. 5, pp. 2901-2911, 2018.
    3. B. Sharada, S.N. Sushma and Bharathlal, “Keyword Spotting in Historical Devanagari Manuscripts by Word Matching”, Data Analytics and Learning (DAL'18), pp. 65, India, 2018.
    4. O. Boudraa, W.K. Hidouci and D. Michelucci, “An improved skew angle detection and correction technique for historical scanned documents using morphological skeleton and progressive probabilistic hough transform”, 5th International Conference on Electrical Engineering - Boumerdes (ICEE-B'17), pp. 1-16, 2017
    5. D. Brodic and Z.N. Milivojevic, “Text skew detection using combined entropy algorithm”, Information Technology and Control, vol. 46, no. 3, pp. 308-318, 2017
    6. V. Vučković and B. Arizanovic, “Automatic Document Skew Pre-processor for Character Segmentation Algorithm”, Facta Universitatis, Electronics and Energetics, vol. 30, no. 4, pp. 611-625, 2017
    7. S. Eskenazi, P. Gomez-Kramer and J.M. Ogier, “Let’s be done with thresholds!”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 851-855, Nancy, France, 2015
    8. F. Stahlberg and S. Vogel, “Document Skew Detection Based on Hough Space Derivatives”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 366-370, Nancy, France, 2015
    9. R. Pintus, Y. Yang, E. Gobbetti and H. Rushmeier, “An Automatic Word-spotting Framework for Medieval Manuscripts”, 2nd Digital Heritage International Congress, pp. 5-12, Granada, Spain, 2015
    10. D. Brodić, M. Jevtić, Z.N. Milivojević and V. Tasić, “Text Skew Estimation Based on the Horizontal Entropy Calculation”, International Convention on Information and Communication Technology, Electronics and Microelectronics, Adriatic Coast, Croatia, 2015
    11. R. Pintus, Y. Yang and H. Rushmeier, “ATHENA: Automatic text height extraction for the analysis of text lines in old handwritten manuscripts”, Journal of Computing and Cultural Heritage, vol. 8, no. 1, 2015
    12. R. Pintus, Y. Yang, E. Gobbetti and H. Rushmeier, "A TaLISMAN: Automatic Text and LIne Segmentation of historical MANuscripts", 12th Eurographics Workshop on Graphics and Cultural Heritage, Darmstadt, Germany, 2014
    13. J. Fabrizio, "A precise skew estimation algorithm for document images using KNN clustering and fourier transform", International Conference on Image Processing (ICIP'14), pp. 2585-2588, 2014

  • G. Louloudis, B. Gatos, N. Stamatopoulos and A. Papandreou, “ICDAR2013 Writer Identification Contest”, 12th International Conference on Document Analysis and Recognition (ICDAR'13), pp. 1397-1401, Washington DC, USA, August 2013.

  • Writer identification is important for forensic analysis, helping experts to deliberate on the authenticity of documents. The ICDAR2013 Competition on Writer Identification is part of a competition series (see also ICDAR2011 and ICFHR2012 Writer Identification Contests) dedicated to recording recent advances in the field of writer identification for Latin scripts using established evaluation performance measures. The benchmarking dataset was created with the help of 250 writers who were asked to copy four parts of text in two Latin-based languages (English and Greek). This paper describes the details of this competition, including the evaluation measures used, as well as the performance of the 12 methods submitted by 6 different groups along with a short description of each method.
    1. M.L. Bouibed, H. Nemmour and Y. Chibani, “Multiple writer retrieval systems based on language independent dissimilarity learning”, Expert Systems with Applications, DOI: https://doi.org/10.1016/j.eswa.2019.113023, 2019
    2. A. Nicolaou, S. Dey, V. Christlein, A. Maier and D. Karatzas, “Non-deterministic Behavior of Ranking-Based Metrics When Evaluating Embeddings”, 2nd International Workshop on Reproducible Research in Pattern Recognition (RRPR'18), pp. 71-82, 2019
    3. B. Riyadh, V. Eglin and C. Largeron, “Extraction of musical motifs from handwritten music score images”, 14th International Conference on Computer Vision Theory and Applications (VISAPP'19), pp. 428-435, 2019
    4. S. Chen, Y. Wang, C.T. Lin, W. Ding and Z. Cao, “Semi-supervised feature learning for improving writer identification”, Information Sciences, vol. 482, pp. 156-170, 2019
    5. M. Keglevic, S. Fiel and R. Sablatnig, “Learning Features for Writer Retrieval and Identification using Triplet CNNs”, 16th International Conference on Frontiers in Handwriting Recognition (ICFHR'18), pp. 211-216, 2018
    6. G. Abdeljalil, I. Siddiqi, C. Djeddi and S. Al-Maadeed, “Writer Identification on Historical Documents Using Oriented Basic Image Features”, 16th International Conference on Frontiers in Handwriting Recognition (ICFHR'18), pp. 369-373, 2018
    7. F. Wahlberg, “Gaussian process classification as metric learning for forensic writer identification”, 13th IAPR International Workshop on Document Analysis Systems (DAS'18), pp. 175-180, 2018
    8. V. Christlein and A. Maier, “Encoding CNN activations for writer recognition”, 13th IAPR International Workshop on Document Analysis Systems (DAS'18), pp. 169-174, 2018
    9. K. Ni, P. Callier, B. Hatch, J. Mastarone and J. Cline, "On noise reduction for handwritten writer identification", 51st Asilomar Conference on Signals, Systems and Computers (ACSSC'17), pp. 1984-1988, 2017
    10. H. Mohammed, V. Maergner, T. Konidaris and H.S. Stiehl, “Normalised Local Naïve Bayes Nearest-Neighbour Classifier for Offline Writer Identification”, 14th IAPR International Conference on Document Analysis and Recognition (ICDAR'17), pp. 1013-1018, 2017
    11. G.J. Tan, G. Sulong and M.S.M. Rahim, "Writer Identification: A comparative study across three world major language", Forensic Science International, https://doi.org/10.1016/j.forsciint.2017.07.034, 2017
    12. K. Ni, P. Callier and B. Hatch, "Writer Identification in Noisy Handwritten Documents", Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, USA, pp. 1177-1186, 2017
    13. V. Christlein, D. Bernecker, F. Hönig, A. Maier and E. Angelopoulou, “Writer Identification Using GMM Supervectors and Exemplar-SVMs”, Pattern Recognition, doi.org/10.1016/j.patcog.2016.10.005, 2016
    14. S. He and L. Schomaker, “Beyond OCR: Multi-faceted understanding of handwritten document characteristics”, Pattern Recognition, DOI: 10.1016/j.patcog.2016.09.017, 2016
    15. D. Siegmund, T. Ebert and N. Damer, “Combining Low-Level Features of Offline Questionnaires for Handwriting Identification”, 13th International Conference on Image Analysis and Recognition (ICIAR'16), pp. 46-54, Portugal, 2016
    16. S. He and L. Schomaker, “General Pattern Run-Length Transform for Writer Identification”, 12th Workshop on Document Analysis Systems (DAS'16), pp. 60-65, Santorini, Greece, 2016
    17. Y. Tang and X. Wu, “Text-independent Writer Identification via CNN Features and Joint Bayesian”, 15th International Conference on Frontiers in Handwriting Recognition (ICFHR'16), pp. 566-571, 2016
    18. S. He and L. Schomaker, “Co-occurrence features for writer identification”, 15th International Conference on Frontiers in Handwriting Recognition (ICFHR'16), pp. 78-83, 2016
    19. A. Parziale, A. Santoro and A. Marcelli, “Writer verification in forensic handwriting examination: a pilot study”, 15th International Conference on Frontiers in Handwriting Recognition (ICFHR'16), pp. 447-452, 2016
    20. V. Christlein, D. Bernecker, A. Maier and E. Angelopoulou, “Offline Writer Identification Using Convolutional Neural Network Activation Features”, 37th German Conference Pattern Recognition (GCPR '15), vol. 9358, 2015
    21. S. Fiel and R. Sablatnig, “Writer Identification and Retrieval Using a Convolutional Neural Network”, 16th International Conference on Computer Analysis of Images and Patterns (CAIP'15), pp. 26-37, Malta, 2015
    22. C. Djeddi, S. Al-Maadeed, A. Gattal, I. Siddiqi, L. Souici-Meslati and H.E. Abed, “ICDAR2015 Competition on Multi-script Writer Identification and Gender Classification using ‘QUWI’ Database”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 1191-1195, Nancy, France, 2015
    23. V. Christlein, D. Bernecker and E. Angelopoulou, “Writer Identification Using VLAD Encoded Contour-Zernike Moments”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 906-910, Nancy, France, 2015
    24. A. Nicolaou, A.D. Bagdanov, M. Liwicki and D. Karatzas, “Sparse Radial Sampling LBP for Writer Identification”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 716-720, Nancy, France, 2015
    25. Y.J. Xiong, Y. Wen, P.S.P Wang and Y. Lu, “Text-independent Writer Identification Using SIFT Descriptor and Contour-directional Feature”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 91-95, Nancy, France, 2015
    26. C. Adak and B.B. Chaudhuri, “Writer Identification from offline isolated Bangla characters and numerals”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 486-490, Nancy, France, 2015
    27. A. Marcelli, A. Parziale and C.D. Stefano, "Quantitative Evaluation of Features for Forensic Handwriting Examination", 4th International Workshop on Automated Forensic Handwriting Analysis (AFHA’15), pp. 1266-1271, Nancy, France, 2015
    28. F. Wahlberg, L. Mårtensson and A. Brun, "Large scale style based dating of medieval manuscripts", 3rd International Workshop on Historical Document Imaging and Processing (HIP’15), pp. 107-114, Nancy, France, 2015
    29. A. Garz, M. Wursch and R. Ingold, "Training- and Segmentation-Free Intuitive Writer Identification with Task-Adapted Interest Points", 17th Conference of the International Graphonomics Society (IGS'15), 2015
    30. S. He and L. Schomaker, "Delta-n Hinge: Rotation-invariant features for writer identification", International Conference on Pattern Recognition, art. no. 6977065, pp. 2023-2028, 2014
    31. R. Jain and D. Doermann, "Combining Local Features For Offline Writer Identification", 14th International Conference on Frontiers in Handwriting Recognition (ICFHR'14), pp. 583-588, Crete, Greece, September 2014
    32. F. Wahlberg, L. Mårtensson and A. Brun, "Scribal Attribution using a Novel 3-D Quill-Curvature Feature Histogram", 14th International Conference on Frontiers in Handwriting Recognition (ICFHR'14), pp. 732-737, Crete, Greece, September 2014
    33. V. Christlein, D. Bernecker, F. Honig and E. Angelopoulou, “Writer identification and verification using GMM supervectors”, 2014 IEEE Winter Conference on Applications of Computer Vision (WACV'14), Steamboat Springs, USA, pp. 998-1005, 2014

  • G. Louloudis, B. Gatos and N. Stamatopoulos, “ICFHR2012 Competition on Writer Identification - Challenge 1: Latin/Greek Documents”, 13th International Conference on Frontiers in Handwriting Recognition (ICFHR'12), pp. 825-830, Bari, Italy, September 2012.

  • Writer identification is important for forensic analysis, helping experts to deliberate on the authenticity of documents. The general objective of the ICFHR 2012 writer identification contest was to record recent advances in the field of writer identification using established evaluation performance measures. Challenge 1 of the contest dealt specifically with Latin scripts. The benchmarking dataset of challenge 1 of the contest was created with the help of 100 writers that were asked to copy four parts of text in two languages (English and Greek). This paper describes the contest details for this challenge including the evaluation measures used as well as the performance of the 7 submitted methods along with a short description of each method.
    1. W. Bouamra, C. Djeddi, B. Nini, M. Diaz and I. Siddiqi, “Towards the design of an offline signature verifier based on a small number of genuine samples for training”, Expert Systems with Applications, vol. 107, pp. 182-195, 2018
    2. H. Mohammed, V. Maergner, T. Konidaris and H.S. Stiehl, “Normalised Local Naïve Bayes Nearest-Neighbour Classifier for Offline Writer Identification”, 14th IAPR International Conference on Document Analysis and Recognition (ICDAR'17), pp. 1013-1018, 2017
    3. G.J. Tan, G. Sulong and M.S.M. Rahim, "Writer Identification: A comparative study across three world major language", Forensic Science International, https://doi.org/10.1016/j.forsciint.2017.07.034, 2017
    4. B. Sober and D. Levin, “Computer aided restoration of handwritten character strokes”, CAD Computer Aided Design, vol. 89, pp. 12-24, 2017
    5. Y. Tang and X. Wu, “Text-independent Writer Identification via CNN Features and Joint Bayesian”, 15th International Conference on Frontiers in Handwriting Recognition (ICFHR'16), pp. 566-571, 2016
    6. V. Christlein, D. Bernecker, F. Hönig, A. Maier and E. Angelopoulou, “Writer Identification Using GMM Supervectors and Exemplar-SVMs”, Pattern Recognition, doi.org/10.1016/j.patcog.2016.10.005, 2016
    7. A. Inamdar, “Offline Text-Independent Writer Identification”, International Journal of Engineering Applied Sciences and Technology, vol. 1, no. 9, pp. 90-94, 2016
    8. C. Djeddi, I. Siddiqi, S. Al-Maadeed, L. Souici-Meslati, A. Gattal and A. Ennaji, “Signature Verification for Offline Skilled Forgeries Using Textural Features”, 11th International Conference on Signal-Image Technology and Internet-Based Systems (SITIS'15), pp. 76-80, Bangkok, 2015
    9. A. Nicolaou, A.D. Bagdanov, M. Liwicki and D. Karatzas, “Sparse Radial Sampling LBP for Writer Identification”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 716-720, Nancy, France, 2015
    10. Y.J. Xiong, Y. Wen, P.S.P Wang and Y. Lu, “Text-independent Writer Identification Using SIFT Descriptor and Contour-directional Feature”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 91-95, Nancy, France, 2015
    11. M.K. Sharma and V.P. Dhaka, "Offline scripting-free author identification based on speeded-up robust features", International Journal on Document Analysis and Recognition (IJDAR), vol. 18, no. 4, pp. 303-316, 2015
    12. K. Gayathri and J. Bhuvana, "Optimization of Signature Recognition in IAM Dataset", International Journal of Innovative Research in Engineering Science and Technology (IJIREST), vol. 3, no. 2, pp. 89-93, 201
    13. Y. Tang, W. Bu, X. Wu, "Text-independent writer identification using improved structural features", 9th Chinese Conference on Biometric Recognition (CCBR'14), pp. 404-411, Shenyang, China, 2014
    14. F. Slimane, S. Awaida, A. Mezghani, M.T. Parvez, S. Kanoun, S.A. Mahmoud and V. Märgner, "ICFHR2014 Competition on Arabic Writer Identification Using AHTID/MW and KHATT Databases", 14th International Conference on Frontiers in Handwriting Recognition (ICFHR'14), pp. 797-802, Crete, Greece, September 2014
    15. F. Wahlberg, L. Mårtensson and A. Brun, "Scribal Attribution using a Novel 3-D Quill-Curvature Feature Histogram", 14th International Conference on Frontiers in Handwriting Recognition (ICFHR'14), pp. 732-737, Crete, Greece, September 2014
    16. C. Djeddi, L.S. Meslati, I. Siddiqi, A. Ennaji, H.E. Abeda and A. Gattal, “Evaluation of Texture Features for Offline Arabic Writer Identification”, 11th IAPR International Workshop on Document Analysis Systems (DAS'14), Tours, France, pp. 106-110, 2014
    17. X. Wu, Y. Tang and W. Bu, "Offline Text-independent Writer Identification Based on Scale Invariant Feature Transform", IEEE Transactions on Information Forensics and Security, vol. 9, no. 3, pp. 526-536, 2014
    18. F. Kleber, S. Fiel, M. Diem and R. Sablatnig, “CVL-Database: An Off-line Database for Writer Retrieval, Writer Identification and Word Spotting”, 12th International Conference on Document Analysis and Recognition (ICDAR'13), pp. 560-564, Washington DC, USA, August 2013
    19. S. Fiel and R. Sablatnig, “Writer Identification and Writer Retrieval using the Fisher Vector on Visual Vocabularies”, 12th International Conference on Document Analysis and Recognition (ICDAR'13), pp. 545-549, Washington DC, USA, August 2013
    20. A. Nicolaou, M. Liwicki and R. Ingold, “Oriented Local Binary Patterns for Writer Identification”, 2nd International Workshop and Tutorial on Automated Forensic Handwriting Analysis (AFHA'13), Washington DC, USA, August 2013
    21. C. Djeddi, I. Siddiqi, L. Souici-Meslati and A. Ennaji, “Text-Independent Writer Recognition Using Multi-script Handwritten Texts”, Pattern Recognition Letters, vol. 34, no. 10, pp. 1196-1202, 2013

  • B. Gatos, G. Louloudis and N. Stamatopoulos, “Greek Polytonic OCR based on Efficient Character Class Number Reduction”, 11th International Conference on Document Analysis and Recognition (ICDAR'11), pp. 1155-1159, Beijing, China, September 2011.

  • Recognition of document images having Greek polytonic (multi-accent) characters is a challenging task due to the large number of existing character classes (more than 270). In this paper, we propose a novel OCR framework for the recognition of machine-printed Greek polytonic documents that is based on combining five different recognition modules in order to have a small number of classes (around 30) in each module. One recognition module is used for accent recognition while four recognition modules are used for the recognition of characters belonging to different horizontal text zones. The proposed system also includes the following stages: a) preprocessing, b) text dewarping, text line and text baseline detection, c) accent and character detection and d) combination of accent and character recognition results. Extended experiments have been conducted in order to record the performance of the proposed OCR system, of all involved recognition modules as well as of the accent detection stage.
    1. B. Robertson and F. Boschetti, "Large-Scale Optical Character Recognition of Ancient Greek", Mouseion, vol. 14, no. 3, pp. 341–359, 2017

  • G. Louloudis, N. Stamatopoulos and B. Gatos, “ICDAR 2011 - Writer Identification Contest”, 11th International Conference on Document Analysis and Recognition (ICDAR'11), pp. 1475-1479, Beijing, China, September 2011.

  • The ICDAR 2011 Writer Identification Contest is the first contest dedicated to recording recent advances in the field of writer identification using established evaluation performance measures. The benchmarking dataset of the contest was created with the help of 26 writers who were asked to copy eight pages that contain text in several languages (English, French, German and Greek). This paper describes the contest details, including the evaluation measures used, as well as the performance of the 8 submitted methods along with a short description of each method.
    1. M.L. Bouibed, H. Nemmour and Y. Chibani, “Multiple writer retrieval systems based on language independent dissimilarity learning”, Expert Systems with Applications, DOI: https://doi.org/10.1016/j.eswa.2019.113023, 2019
    2. A. Nicolaou, S. Dey, V. Christlein, A. Maier and D. Karatzas, “Non-deterministic Behavior of Ranking-Based Metrics When Evaluating Embeddings”, 2nd International Workshop on Reproducible Research in Pattern Recognition (RRPR'18), pp. 71-82, 2019
    3. A. Bennour, C. Djeddi, A. Gattal, I. Siddiqi and T. Mekhaznia, “Handwriting Based Writer Recognition Using Implicit Shape Codebook”, Forensic Science International, vol. 301, pp. 91-100, 2019
    4. A. Chahi, Y. El Merabet, Y. Ruichek and R. Touahni, “Off-line Text-independent Writer Identification Using Local Convex Micro-Structure Patterns”, Second conference of The Moroccan Classification Society (SMC'18), 2019
    5. S. Chen, Y. Wang, C.T. Lin, W. Ding and Z. Cao, “Semi-supervised feature learning for improving writer identification”, Information Sciences, vol. 482, pp. 156-170, 2019
    6. M.L. Bouibed, H. Nemmour and Y. Chibani, “Evaluation of gradient descriptors and dissimilarity learning for writer retrieval”, 8th International Conference on Information Science and Technology (ICIST'18), pp. 252-256, 2018
    7. F. Khan, F. Khelifi, M. Tahir and A. Bouridane, “Dissimilarity Gaussian Mixture Models for Efficient Offline Handwritten Text-Independent Identification using SIFT and RootSIFT Descriptors”, IEEE Transactions on Information Forensics and Security, 2018
    8. W. Bouamra, C. Djeddi, B. Nini, M. Diaz and I. Siddiqi, “Towards the design of an offline signature verifier based on a small number of genuine samples for training”, Expert Systems with Applications, vol. 107, pp. 182-195, 2018
    9. H. Mohammed, V. Maergner, T. Konidaris and H.S. Stiehl, “Normalised Local Naïve Bayes Nearest-Neighbour Classifier for Offline Writer Identification”, 14th IAPR International Conference on Document Analysis and Recognition (ICDAR'17), pp. 1013-1018, 2017
    10. A.A. Ahmed, H.R. Hasan, F.A. Hameed and O.I. Al-Sanjary, "Writer Identification on Multi-Script Handwritten Using Optimum Features", Kurdistan Journal of Applied Research - KJAR, vol. 2, no. 3, 2017
    11. G.J. Tan, G. Sulong and M.S.M. Rahim, "Writer Identification: A comparative study across three world major language", Forensic Science International, https://doi.org/10.1016/j.forsciint.2017.07.034, 2017
    12. K. Ni, P. Callier and B. Hatch, "Writer Identification in Noisy Handwritten Documents", Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, USA, pp. 1177-1186, 2017
    13. C. Adak, B.B. Chaudhuri and M. Blumenstein, “Writer identification by training on one script but testing on another”, 23rd International Conference on Pattern Recognition (ICPR'16), pp. 1153-1158, 2016
    14. Y. Tang and X. Wu, “Text-independent Writer Identification via CNN Features and Joint Bayesian”, 15th International Conference on Frontiers in Handwriting Recognition (ICFHR'16), pp. 566-571, 2016
    15. V. Christlein, D. Bernecker, F. Hönig, A. Maier and E. Angelopoulou, “Writer Identification Using GMM Supervectors and Exemplar-SVMs”, Pattern Recognition, doi.org/10.1016/j.patcog.2016.10.005, 2016
    16. A. Nicolaou, A.D. Bagdanov, L. Gomez-Bigorda and D. Karatzas, “Visual Script and Language Identification”, 12th Workshop on Document Analysis Systems (DAS'16), pp. 393-398, Santorini, Greece, 2016
    17. A. Inamdar, “Offline Text-Independent Writer Identification”, International Journal of Engineering Applied Sciences and Technology, vol. 1, no. 9, pp. 90-94, 2016
    18. C. Djeddi, I. Siddiqi, S. Al-Maadeed, L. Souici-Meslati, A. Gattal and A. Ennaji, “Signature Verification for Offline Skilled Forgeries Using Textural Features”, 11th International Conference on Signal-Image Technology and Internet-Based Systems (SITIS'15), pp. 76-80, Bangkok, 2015
    19. S. Fiel and R. Sablatnig, “Writer Identification and Retrieval Using a Convolutional Neural Network”, 16th International Conference on Computer Analysis of Images and Patterns (CAIP'15), pp. 26-37, Malta, 2015
    20. M.K. Sharma and V.P. Dhaka, "Offline scripting-free author identification based on speeded-up robust features", International Journal on Document Analysis and Recognition (IJDAR), vol. 18, no. 4, pp. 303-316, 2015
    21. K. Gayathri and J. Bhuvana, "Optimization of Signature Recognition in IAM Dataset", International Journal of Innovative Research in Engineering Science and Technology (IJIREST), vol. 3, no. 2, pp. 89-93, 201
    22. A. Garz, M. Wursch and R. Ingold, "Training- and Segmentation-Free Intuitive Writer Identification with Task-Adapted Interest Points", 17th Conference of the International Graphonomics Society (IGS'15), 2015
    23. S. Al-Maadeed, A. Hassaine and A. Bouridane, “Using codebooks generated from text skeletonization for forensic writer identification”, 11th IEEE/ACS International Conference on Computer Systems and Applications (AICCSA'14), pp. 729-733, Doha, Qatar, November 2014
    24. Y. Tang, W. Bu, X. Wu, "Text-independent writer identification using improved structural features", 9th Chinese Conference on Biometric Recognition (CCBR'14), pp. 404-411, Shenyang, China, 2014
    25. C. Djeddi, L.S. Meslati, I. Siddiqi, A. Ennaji, H.E. Abeda and A. Gattal, “Evaluation of Texture Features for Offline Arabic Writer Identification”, 11th IAPR International Workshop on Document Analysis Systems (DAS'14), Tours, France, pp. 106-110, 2014
    26. M.R. Welekar and M.V.S.D. Rao, “Survey on Existing Techniques for Writer Verification”, International journal of advanced computer technology (COMPUSOFT), vol. 3, no. 5, pp. 773-776, 2014
    27. S. Fiel, F. Hollaus, M. Gau and R. Sablatnig, “Writer identification on historical Glagolitic documents”, Proceedings of SPIE - The International Society for Optical Engineering, vol. 9021, article number 902102, Document Recognition and Retrieval XXI, San Francisco, United States, 2014
    28. H. Ding, H. Wu, X. Zhang and J.P. Chen, "Writer Identification Based on Local Contour Distribution Feature", International Journal of Signal Processing, Image Processing and Pattern Recognition, vol. 7, no. 1, pp. 169-180, 2014
    29. X. Wu, Y. Tang and W. Bu, "Offline Text-independent Writer Identification Based on Scale Invariant Feature Transform", IEEE Transactions on Information Forensics and Security, vol. 9, no. 3, pp. 526-536, 2014
    30. A.J. Newell and L.D. Griffin, "Writer identification using oriented basic image features and the delta encoding", Pattern Recognition, vol. 47, no. 6, pp. 2255-2265, 2013
    31. J. Chen and D. Lopresti, “Alternatives for Page Skew Compensation in Writer Identification”, 12th International Conference on Document Analysis and Recognition (ICDAR'13), pp. 927-931, Washington DC, USA, August 2013
    32. F. Kleber, S. Fiel, M. Diem and R. Sablatnig, “CVL-Database: An Off-line Database for Writer Retrieval, Writer Identification and Word Spotting”, 12th International Conference on Document Analysis and Recognition (ICDAR'13), pp. 560-564, Washington DC, USA, August 2013
    33. S. Fiel and R. Sablatnig, “Writer Identification and Writer Retrieval using the Fisher Vector on Visual Vocabularies”, 12th International Conference on Document Analysis and Recognition (ICDAR'13), pp. 545-549, Washington DC, USA, August 2013
    34. Z.A. Daniels and H.S. Baird, “Discriminating Features for Writer Identification”, 12th International Conference on Document Analysis and Recognition (ICDAR'13), pp. 1417-1421, Washington DC, USA, August 2013
    35. A. Nicolaou, M. Liwicki and R. Ingold, “Oriented Local Binary Patterns for Writer Identification”, 2nd International Workshop and Tutorial on Automated Forensic Handwriting Analysis (AFHA'13), Washington DC, USA, August 2013
    36. C. Djeddi, I. Siddiqi, L. Souici-Meslati and A. Ennaji, “Text-Independent Writer Recognition Using Multi-script Handwritten Texts”, Pattern Recognition Letters, vol. 34, no. 10, pp. 1196-1202, 2013
    37. D. Hong, F.Y. Yang and X.F. Zhang, "Local fragment distribution features for text-independent writer identification", BioTechnology: An Indian Journal, vol. 8, no. 6, pp. 855-860, 2013
    38. S. Al-Maadeed, W. Ayouby, A. Hassaine and J. Alja’am, “QUWI: An Arabic and English Handwriting Dataset for Offline Writer Identification”, 13th International Conference on Frontiers in Handwriting Recognition (ICFHR'12), pp. 742-747, Bari, Italy, September 2012
    39. A. Hassaine and S. Al-Maadeed, “ICFHR2012 competition on writer identification - Challenge 2: Arabic scripts”, 13th International Conference on Frontiers in Handwriting Recognition (ICFHR'12), pp. 835-840, Bari, Italy, September 2012
    40. C. Djeddi, I. Siddiqi, L. Souici-Meslati and A. Ennaji, “Multi-script Writer Identification Optimized With Retrieval Mechanism”, 13th International Conference on Frontiers in Handwriting Recognition (ICFHR'12), pp. 507-512, Bari, Italy, September 2012
    41. C. Djeddi, L. Souici-Meslati and A. Ennaji, “Writer recognition on Arabic handwritten documents”, 5th International Conference on Image and Signal Processing (ICISP'12), pp. 493-501, Agadir, Morocco, 2012

  • N. Stamatopoulos, G. Louloudis and B. Gatos, “Efficient Transcript Mapping to Ease the Creation of Document Image Segmentation Ground Truth with Text-Image Alignment”, 12th International Conference on Frontiers in Handwriting Recognition (ICFHR'10), pp. 226-231, Kolkata, India, November 2010.

  • One of the major issues in document image processing is the efficient creation of ground truth in order to be used for training and evaluation purposes. Since a large number of tools have to be trained and evaluated in realistic circumstances, we need to have a quick and low cost way to create the corresponding ground truth. Moreover, the specific need for having the correct text correlated with the corresponding image area in text line and word level makes the process of ground truth creation a difficult, tedious and costly task. In this paper, we introduce an efficient transcript mapping technique to ease the construction of document image segmentation ground truth that includes text-image alignment. The proposed text line transcript mapping technique is based on Hough transform that is guided by the number of the text lines. Concerning the word segmentation ground truth, a gap classification technique constrained by the number of the words is used. Experimental results prove that using the proposed technique for handwritten documents, the percentage of time saved for ground truth creation and text-image alignment is more than 90%.
    1. A. Vij and J. Pruthi, “An automated Psychometric Analyzer based on Sentiment Analysis and Emotion Recognition for healthcare”, International Conference on Computational Intelligence and Data Science (ICCIDS'18), pp. 1184-1191, 2018
    2. A. Vij and J. Pruthi, “An automated Psychometric Analyzer based on Sentiment Analysis and Emotion Recognition for healthcare”, Procedia Computer Science, vol. 132, pp. 1184-1191, 2018
    3. M. Kassis, J. Nassour and J. El-Sana, “Alignment of Historical Handwritten Manuscripts Using Siamese Neural Network”, 14th IAPR International Conference on Document Analysis and Recognition (ICDAR'17), pp. 293-298, 2017
    4. M. Seuret, R. Ingold, and M. Liwicki, “N-light-N: A Highly-Adaptable Java Library for Document Analysis with Convolutional Auto-Encoders and Related Architectures”, 15th International Conference on Frontiers in Handwriting Recognition (ICFHR'16), pp. 459-464, 2016
    5. G. Sadeh, L. Wolf, T. Hassner, N. Dershowitz and D.S. Ben-Ezra, “Viral Transcript Alignment”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 711-715, Nancy, France, 2015
    6. W. Swaileh, K.A. Mohand and T. Paquet, "Multi-script Iterative Steerable Directional Filtering For Handwritten Text Line Extraction", 5th International Workshop on Multilingual OCR (MOCR'15), pp. 1241-1245, Nancy, France, 2015
    7. Y. Leydier, V. Églin, S. Bres and D. Stutzmann, "Learning-free text-image alignment for medieval manuscripts", 14th International Conference on Frontiers in Handwriting Recognition (ICFHR'14), pp. 363-368, Crete, Greece, September 2014
    8. F. Yin, Q-F. Wang and C-L. Liu, “Transcript mapping for handwritten Chinese documents by integrating character recognition model and geometric context”, Pattern Recognition, vol. 46, no. 10, pp. 2807-2818, 2013
    9. X.D. Zhou, F. Yin, D.H. Wang, Q.F. Wang, M. Nakagawa and C.L. Liu, “Transcript Mapping for Handwritten Text Lines Using Conditional Random Fields”, 11th International Conference on Document Analysis and Recognition (ICDAR'11), pp. 58-62, Beijing, China, September 2011
    10. S. Vajda, A. Junaidi and G.A. Fink, “A Semi-Supervised Ensemble Learning Approach for Character Labeling with Minimal Human Effort”, 11th International Conference on Document Analysis and Recognition (ICDAR'11), pp. 259-263, Beijing, China, September 2011
    11. A. Fischer, V. Frinken, A. Fornés and H. Bunke, “Transcription Alignment of Latin Manuscripts Using Hidden Markov Models”, 1st Workshop on Historical Document Imaging and Processing (HIP'11), pp. 29-36, Beijing, China, September 2011
    12. A. Junaidi, S. Vajda and G.A. Fink, “Lampung - a new handwritten character benchmark: database, labeling and recognition”, Joint Workshop on Multilingual OCR and Analytics for Noisy Unstructured Text Data (MOCR_AND '11), Beijing, China, September 2011
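
The word-level step in the abstract above, a gap classification constrained by the number of words, can be illustrated with a minimal sketch. It assumes the simplest possible rule: if the transcript says a text line holds N words, the N-1 widest gaps between adjacent components are taken as word separators. The function name and the widest-gap rule are illustrative assumptions, not the classifier used in the paper.

```python
from typing import List


def select_word_gaps(gap_widths: List[float], num_words: int) -> List[int]:
    """Return the indices (left to right) of gaps treated as word separators.

    gap_widths[i] is the width of the i-th gap between adjacent components
    of one text line; num_words comes from the line's transcript.
    """
    if num_words < 1:
        raise ValueError("num_words must be at least 1")
    # A line with N words has N-1 word separators (capped by available gaps).
    num_separators = min(num_words - 1, len(gap_widths))
    # Rank gap indices from widest to narrowest; sorted() is stable, so ties
    # keep their left-to-right order.
    ranked = sorted(range(len(gap_widths)), key=lambda i: gap_widths[i], reverse=True)
    # Report the chosen separators in reading order.
    return sorted(ranked[:num_separators])


# Five gaps on a line whose transcript contains three words.
print(select_word_gaps([3, 12, 4, 15, 2], 3))  # [1, 3]
```

Because the transcript fixes the number of words, no width threshold has to be tuned for each writer.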

  • B. Gatos, N. Stamatopoulos and G. Louloudis, “ICFHR 2010 Handwriting Segmentation Contest”, 12th International Conference on Frontiers in Handwriting Recognition (ICFHR'10), pp. 737-742, Kolkata, India, November 2010.

  • The general objective of the ICFHR 2010 Handwriting Segmentation Contest, organized in the context of the ICFHR 2010 conference, was to use well established evaluation practices and procedures in order to record recent advances in off-line handwriting segmentation. Two new benchmarking datasets, one for text line and one for word segmentation, were created in order to test and compare recent algorithms for handwritten document segmentation in realistic circumstances. Handwritten document images were produced by many writers in several languages (English, French, German and Greek). The dataset of the previously organized contest (ICDAR 2009 Handwriting Segmentation Contest) was used as the training dataset. This paper describes the contest details including the datasets, the ground truth, the evaluation criteria as well as the performance of the 7 submitted methods along with a short description of each method.
    1. G.M. Binmakhashen and S.A. Mahmoud, “Document Layout Analysis: A Comprehensive Survey”, ACM Computing Surveys (CSUR), vol. 52, no. 6, 2019
    2. M. Pastor, "Text baseline detection, a single page trained system", Pattern Recognition, vol. 94, pp. 149-161, 2019
    3. T. Grüning, R. Labahn, M. Diem, F. Kleber and S. Fiel, “READ-BAD: A new dataset and evaluation scheme for baseline detection in archival documents”, 13th IAPR International Workshop on Document Analysis Systems (DAS'18), pp. 351-356, 2018
    4. T. Grüning, G. Leifert, T. Strauss and R. Labahn, “A Robust and Binarization-Free Approach for Text Line Detection in Historical Documents”, 14th IAPR International Conference on Document Analysis and Recognition (ICDAR'17), pp. 236-241, 2017
    5. H. Jain and A.P. Kumar, "A Bottom Up Procedure for Text Line Segmentation of Latin Script", International Conference on Advances in Computing, Communications and Informatics (ICACCI'17), 2017
    6. P. Sahare and S.B. Dhok, "Review of Text Extraction Algorithms for Scene-text and Document Images", IETE Technical Review, vol. 34, no. 2, pp. 144-164, 2017
    7. Y. Boulid, A. Souhar and M.Y. Elkettani, “Arabic handwritten text line extraction using connected component analysis from a multi agent perspective”, International Conference on Intelligent Systems Design and Applications (ISDA'16), pp. 80-87, 2016
    8. P. Choudhary and N. Nain, “A Four-Tier Annotated Urdu Handwritten Text Image Dataset for Multidisciplinary Research on Urdu Script”, ACM Transactions on Asian and Low-Resource Language Information Processing, vol. 15, no. 4, article no. 26, 2016
    9. Y. Boulid, A. Souhar and M.Y. Elkettani, “Segmentation approach of Arabic manuscripts text lines based on multi agent systems”, International Journal of Computer Information Systems and Industrial Management Applications, vol. 8, no. 1, pp. 173-183, 2016
    10. Y. Boulid, A. Souhar and M.Y. Elkettani, “Detection of Text Lines of Handwritten Arabic Manuscripts using Markov Decision Processes”, International Journal of Interactive Multimedia and Artificial Intelligence, vol. 4, no. 1, pp. 31-36, 2016
    11. W. Swaileh, K.A. Mohand and T. Paquet, "Multi-script Iterative Steerable Directional Filtering For Handwritten Text Line Extraction", 5th International Workshop on Multilingual OCR (MOCR'15), pp. 1241-1245, Nancy, France, 2015
    12. R. Cohen, I. Dinstein, J. El-Sana and K. Kedem, “Using Scale-Space Anisotropic Smoothing for Text Line Extraction in Historical Documents”, 11th International Conference on Image Analysis and Recognition (ICIAR'14), pp. 349-358, 2014
    13. S. Al-Maadeed, A. Hassaine and A. Bouridane, “Using codebooks generated from text skeletonization for forensic writer identification”, 11th IEEE/ACS International Conference on Computer Systems and Applications (AICCSA'14), pp. 729-733, Doha, Qatar, November 2014
    14. Y. Elarian, A. Zidouri and W. Al-Khatib, "Ground-truth and Metric for the Evaluation of Arabic Handwritten Character Segmentation", 14th International Conference on Frontiers in Handwriting Recognition (ICFHR'14), pp. 766-770, Crete, Greece, September 2014
    15. Y. Tang, X. Wu, and W. Bu, “Text Line Segmentation Based on Matched Filtering and Top-Down Grouping for Handwritten Documents”, 11th IAPR International Workshop on Document Analysis Systems (DAS'14), Tours, France, pp. 365-369, 2014
    16. A. Lemaitre, J. Camillerapp and B. Coüasnon, “Handwritten text segmentation using blurred image”, Proceedings of SPIE - The International Society for Optical Engineering, vol. 9021, article number 90210D, Document Recognition and Retrieval XXI, San Francisco, United States, 2014
    17. M. Diem, F. Kleber, S. Fiel, and R. Sablatnig, “Semi-automated document image clustering and retrieval”, Proceedings of SPIE - The International Society for Optical Engineering, vol. 9021, article number 90210M, Document Recognition and Retrieval XXI, San Francisco, United States, 2014
    18. M. Diem, F. Kleber and R. Sablatnig, “Text Line Detection for Heterogeneous Documents”, 12th International Conference on Document Analysis and Recognition (ICDAR'13), pp. 743-747, Washington DC, USA, August 2013
    19. A. Fischer, V. Frinken and H. Bunke, “Hidden Markov models for off-line cursive handwriting recognition”, Handbook of Statistics, vol. 31, pp. 421-442, 2013
    20. M. Haji, K.A. Sahoo, T.D. Bui, C.Y. Suen and D. Ponson, “Statistical hypothesis testing for handwritten word segmentation algorithms”, 13th International Conference on Frontiers in Handwriting Recognition (ICFHR'12), pp. 114-119, Bari, Italy, September 2012
    21. I.B. Messaoud, H. Amiri, H.E. Abed and V. Märgner, “A multilevel text line segmentation framework for handwritten historical documents”, 13th International Conference on Frontiers in Handwriting Recognition (ICFHR'12), pp. 515-520, Bari, Italy, September 2012
    22. F. Wahlberg and A. Brun, “Graph based line segmentation on cluttered handwritten manuscripts”, 21st International Conference on Pattern Recognition (ICPR 2012), pp. 1570-1573, Tsukuba, Japan, November 2012
    23. F. Simistira, V. Papavassiliou, T. Stafylakis and V. Katsouros, “Enhancing Handwritten Word Segmentation by Employing Local Spatial Features”, 11th International Conference on Document Analysis and Recognition (ICDAR'11), pp. 1314-1318, Beijing, China, September 2011

  • N. Stamatopoulos, B. Gatos and T. Georgiou, “Page Frame Detection for Double Page Document Images”, 9th International Workshop on Document Analysis Systems (DAS'10), pp. 401-408, Boston, MA, USA, June 2010.

  • Scanning two book pages at the same time helps to accelerate the scanning process but on the other hand introduces several difficulties if the user needs to have one page per image. A major difficulty is the appearance of noisy black borders around text areas as well as of noisy black stripes between the two pages. In this paper, we propose a novel algorithm for detecting the page frames on double page document images. Our aim is to split the image into the two pages as well as to remove noisy borders. First we apply a pre-processing which includes binarization, noise removal and image smoothing. Then, we detect the vertical zones of the two pages. In this stage, we introduce the vertical white run projections which have been proved efficient for detecting vertical zones of text areas. Finally, the horizontal zones of the two pages are detected based on horizontal white run projections. The experimental results on several double page document images from fifteen different books demonstrate the effectiveness of the proposed technique.
    1. A. Kordecki, “Fast document area detection for scanned images”, Proceedings of SPIE - The International Society for Optical Engineering, vol. 11041, article number 1104120, 2019
    2. M.M. Reza, M.A. Rakib, S.S. Bukhari and A. Dengel, “A Robust Page Frame Detection Method for Complex Historical Document Images”, 8th International Conference on Pattern Recognition Applications and Methods (ICPRAM'19), 2019
    3. C. Tensmeyer, B. Davis, C. Wigington, I. Lee and B. Barrett, “PageNet: Page boundary extraction in historical handwritten documents”, International Workshop on Historical Document Imaging and Processing (HIP'17), pp. 59-64, 2017
    4. T. Mondal, N. Ragot, J.Y. Ramel and U. Pal, “Flexible Sequence Matching Technique: An Effective Learning-free Approach For word-spotting”, Pattern Recognition, DOI: 10.1016/j.patcog.2016.05.011, 2016
    5. A. Chakraborty and M. Blumenstein, “Preserving Text Content from Historical Handwritten Documents”, 12th Workshop on Document Analysis Systems (DAS'16), pp. 329-334, Santorini, Greece, 2016
    6. A. Chakraborty and M. Blumenstein, “Marginal Noise Reduction in Historical Handwritten Documents - A Survey”, 12th Workshop on Document Analysis Systems (DAS'16), pp. 323-328, Santorini, Greece, 2016
    7. C. Crovato, D. Torok, R. Heidrich, B. Cerqueira and E. Velho, “Preparing for OCR of Books Handled by Visually Impaired”, 10th International Conference Ubiquitous Computing and Ambient Intelligence (UCAmI'16), pp. 419-430, 2016
    8. M. Wagdy, I. Faye and D. Rohaya, “Border noise removal from the document image using X-Y cut and filtering technique based on morphological operation”, International Journal of Imaging and Robotics, vol. 15, no. 3, pp. 88-105, 2015
    9. L.P. Heras, D. Fernandez, A. Fornes, E. Valveny, G. Sanchez and J. Llados, “Perceptual Retrieval of Architectural Floor Plan Images”, 10th IAPR International Workshop on Graphics Recognition, 2013
    10. M. Agrawal and D. Doermann, “Clutter noise removal in binary document images”, International Journal on Document Analysis and Recognition (IJDAR), vol. 16, no. 4, pp. 351-369, 2013
    11. A. Gordo, F. Perronnin and E. Valveny, “Large-scale document image retrieval and classification with runlength histograms and binary embeddings”, Pattern Recognition, 2012
    12. S.S. Bukhari, F. Shafait and T.M. Breuel, “Border Noise Removal of Camera-Captured Document Images using Page Frame Detection”, 4th International Workshop on Camera-Based Document Analysis and Recognition (CBDAR'11), Beijing, China, September 2011
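
The vertical white run projections named in the abstract above can be illustrated with a minimal sketch. It assumes one plausible definition: for each image column, sum the lengths of the vertical runs of white pixels that are at least `min_run` pixels long. The function name, the `min_run` parameter and the 0 = white / 1 = black convention are assumptions for illustration; the paper's exact definition may differ.

```python
from typing import List


def vertical_white_run_projection(binary: List[List[int]], min_run: int) -> List[int]:
    """Per-column sum of vertical white runs of length >= min_run.

    binary is a row-major image with 0 for white and 1 for black pixels.
    """
    if not binary:
        return []
    height, width = len(binary), len(binary[0])
    projection = [0] * width
    for x in range(width):
        run = 0  # length of the white run currently being scanned
        for y in range(height):
            if binary[y][x] == 0:
                run += 1
            else:
                # A black pixel ends the run; keep it only if long enough.
                if run >= min_run:
                    projection[x] += run
                run = 0
        # Account for a run that reaches the bottom of the image.
        if run >= min_run:
            projection[x] += run
    return projection


image = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
    [0, 1, 1],
    [0, 0, 0],
    [0, 1, 0],
]
print(vertical_white_run_projection(image, min_run=2))  # [6, 0, 5]
```

Under this definition, columns crossing text score low because their white runs are short and get discarded, while margins and the gutter between the two pages score high.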

  • G. Vamvakas, N. Stamatopoulos, B. Gatos and S.J. Perantonis, “Automatic Unsupervised Parameter Selection for Character Segmentation”, 9th International Workshop on Document Analysis Systems (DAS'10), pp. 409-415, Boston, MA, USA, June 2010.

  • A major difficulty for designing a document image segmentation methodology is the proper value selection for all involved parameters. This is usually done after experimentations or after involving a training supervised phase which is a tedious process since the corresponding segmentation ground truth has to be created. In this paper, we propose a novel automatic unsupervised parameter selection methodology that can be applied to the character segmentation problem. It is based on clustering of the entities obtained as a result of the segmentation for different values of the parameters involved in the segmentation method. The clustering is performed using features extracted from the segmented entities based on zones and from the area that is formed from the projections of the upper/lower and left/right profiles. Optimization of an appropriate intra-class distance measure yields the optimal parameter vector. The method is evaluated on two segmentation algorithms, namely a recently proposed character segmentation technique based on skeleton segmentation paths, as well as the well known RLSA technique. The proposed parameter selection method is capable of finding the segmentation parameters that correspond to the optimal or near optimal segmentation result, as this is determined by counting the number of matches between the entities detected by the segmentation algorithm and the entities in the ground truth.
    1. R.D. Lins and C. Gomes, “Automatic Training Set Generation for Better Historic Document Transcription and Compression”, 11th IAPR International Workshop on Document Analysis Systems (DAS'14), Tours, France, pp. 277-281, 2014
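
The abstract above judges a segmentation result by counting the matches between detected entities and ground-truth entities. A minimal sketch of such a count follows. It assumes that each entity is a set of pixel coordinates and that two entities match when their intersection-over-union reaches a threshold; the threshold value, the function names and the derived rates are illustrative assumptions rather than the exact protocol of the paper.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]


def count_matches(detected: List[Set[Pixel]], truth: List[Set[Pixel]],
                  threshold: float) -> int:
    """Count one-to-one matches between detected and ground-truth entities."""
    matched_truth = set()  # ground-truth indices that are already matched
    matches = 0
    for region in detected:
        for j, gt in enumerate(truth):
            if j in matched_truth:
                continue
            union = len(region | gt)
            # Match score: shared pixels divided by pixels covered by either.
            if union and len(region & gt) / union >= threshold:
                matched_truth.add(j)
                matches += 1
                break
    return matches


def segmentation_scores(detected: List[Set[Pixel]], truth: List[Set[Pixel]],
                        threshold: float = 0.9) -> Tuple[float, float, float]:
    """Return (detection rate, recognition accuracy, F-measure)."""
    m = count_matches(detected, truth, threshold)
    detection_rate = m / len(truth) if truth else 0.0
    recognition_accuracy = m / len(detected) if detected else 0.0
    total = detection_rate + recognition_accuracy
    f_measure = 2 * detection_rate * recognition_accuracy / total if total else 0.0
    return detection_rate, recognition_accuracy, f_measure


truth = [{(0, 0), (0, 1), (0, 2), (0, 3)}, {(1, 0), (1, 1)}]
detected = [{(0, 0), (0, 1), (0, 2), (0, 3)}, {(1, 0)}, {(2, 0)}]
print(segmentation_scores(detected, truth))
```

In this toy example only the first detected entity matches, so the detection rate is 1/2 and the recognition accuracy is 1/3.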

  • N. Stamatopoulos, B. Gatos and I. Pratikakis, “A Methodology for Document Image Dewarping Techniques Performance Evaluation”, 10th International Conference on Document Analysis and Recognition (ICDAR'09), pp. 956-960, Barcelona, Spain, July 2009.

  • One of the major challenges in camera document analysis is to deal with the page curl and perspective distortions. In spite of the prevalence of dewarping techniques, no standard method for their performance evaluation exists, and most evaluations concentrate on visually pleasing impressions. This paper presents an objective evaluation methodology for document image dewarping techniques. First, manually selected sets of points of the initial warped image are matched with the corresponding points of the dewarping result using the Scale Invariant Feature Transform (SIFT). Each set corresponds to a representative text line of the image. Then, based on cubic polynomial curves that fit to the selected text lines, a comprehensive measure which reflects the entire performance of a dewarping technique in a concise quantitative manner is calculated. Experiments applying the proposed performance evaluation methodology on two state of the art dewarping techniques as well as a commercial package are presented.
    1. J. Diaz-Escobar and V. Kober, “Optical character recognition of camera-captured images based on phase features”, Applications of Digital Image Processing XXXVIII, Article number 959903, 2015
    2. S.S. Bukhari and A. Dengel, “Visual Appearance based Document Classification Methods: Performance Evaluation and Benchmarking”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 981-985, Nancy, France, 2015
    3. J. Diaz-Escobar and V. Kober, “Optical character recognition of camera-captured images based on phase features”, SPIE 9599 Applications of Digital Image Processing XXXVIII, 959903, 2015
    4. T.V. Vidula and V.V. Nair, “A robust performance evaluation scheme for rectification algorithms in camera captured document images”, 1st International Conference on Computational Systems and Communications (ICCSC '14), pp. 162-166, Trivandrum, India, 2014
    5. A. Pugliese, S. Pomes, S. Ferilli and D. Redavid, “A novel model-based dewarping technique for advanced digital library systems”, Italian Research Conference on Digital Libraries (IRCDL'14), pp. 108-115, Padova, Italy, 2014
    6. M. Rahnemoonfar and B. Plale, “Automatic performance evaluation of dewarping methods in large scale digitization of historical documents”, 13th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL'13), pp. 331-334, Indiana, USA, July 2013
    7. L. Tong, G. Zhan, Q. Peng, Y. Li and Y. Li, “Warped document image correction method based on heterogeneous registration strategies”, 5th International Conference on Machine Vision (ICMV'12), article number 878308, Wuhan, China, October 2012
    8. S.S. Bukhari, F. Shafait and T.M. Breuel, “An image based performance evaluation method for page dewarping algorithms using SIFT features”, 4th International Workshop on Camera-Based Document Analysis and Recognition (CBDAR'11), pp. 138-149, Beijing, China, September 2011
    9. S. Pletschacher and A. Antonacopoulos, “The PAGE (Page Analysis and Ground-Truth Elements) Format Framework”, 20th International Conference on Pattern Recognition (ICPR'10), pp. 257-260, Istanbul, Turkey, August 2010

  • G. Louloudis, N. Stamatopoulos and B. Gatos, “A Novel Two Stage Evaluation Methodology for Word Segmentation Techniques”, 10th International Conference on Document Analysis and Recognition (ICDAR'09), pp. 686-690, Barcelona, Spain, July 2009.

  • Word segmentation is a critical stage towards word and character recognition as well as word spotting and mainly concerns two basic aspects, distance computation and gap classification. In this paper, we propose a robust evaluation methodology that treats the distance computation and the gap classification stages independently. The detection rate calculated for every distance metric corresponds to the maximum detection rate that we could have achieved if we had a perfect classifier for the gap classification stage. The proposed evaluation framework has been applied to several state-of-the-art techniques using a handwritten as well as a historical typewritten document set. The best combination of distance metric computation and gap classification state-of-the-art techniques is proposed.
    1. K. Thangairulappan and K. Mohan, "Efficient segmentation of printed Tamil script into characters using projection and structure", 4th International Conference on Image Information Processing (ICIIP'17), pp. 484-489, 2017
    2. A. Abliz, W. Simayi, K. Moydin and A. Hamdulla, “Survey on Methods for Basic Unit Segmentation in Off-Line Handwritten Text Recognition”, International Journal of Future Generation Communication and Networking, vol. 9, no. 11, pp. 137-152, 2016
    3. Y. Lin, Y. Li, Y. Song and F. Wang, “Fast document image comparison in multilingual corpus without OCR”, Multimedia Systems, pp. 1-10, DOI: 10.1007/s00530-015-0484-3, 2015
    4. S. Pannirselvam and S. Ponmani, “A Novel Hybrid Model For Tamil Handwritten Character Segmentation”, International Journal of Scientific & Engineering Research, vol. 5, no. 11, pp. 271-275, 2014
    5. S. Gomathi, R.U. Devi and S. Mohanavel, “Trimming approach for word segmentation with focus on overlapping characters”, International Conference on Computer Communications and Informatics (ICCCI'13), pp. 1-4, Coimbatore, India, 2013

  • B. Gatos, N. Stamatopoulos and G. Louloudis, “ICDAR2009 Handwriting Segmentation Contest”, 10th International Conference on Document Analysis and Recognition (ICDAR'09), pp. 1393-1397, Barcelona, Spain, July 2009.

  • The Handwriting Segmentation Contest was organized in the context of ICDAR2009 conference in order to record recent advances in off-line handwriting segmentation. This paper describes the contest details including the dataset, the ground truth and the evaluation criteria and presents the results of the 12 participating methods. The contest includes handwritten document images produced by many writers in several languages (English, French, German and Greek). These images are manually annotated in order to produce the ground truth which corresponds to the correct text line and word segmentation result. For the evaluation, a well established approach is used based on counting the number of matches between the entities detected by the segmentation algorithm and the entities in the ground truth.
    1. B.M.K. Sharma and V.S. Dhaka, “Segmentation of handwritten words using structured support vector machine”, Pattern Analysis and Applications, DOI: 10.1007/s10044-019-00843-x, 2019
    2. M.A. Garcia-Calderon, R.A. Garcia-Hernandez and Y. Ledeneva, “Providing order to the handwritten TLS task: A complexity index”, Journal of Intelligent and Fuzzy Systems, vol. 36, no. 5, pp. 4621-4631, 2019
    3. M. Pastor, "Text baseline detection, a single page trained system", Pattern Recognition, vol. 94, pp. 149-161, 2019
    4. G. Nagendar, V. Ranjan, G. Harit and C.V. Jawahar, "Efficient query specific dtw distance for document retrieval with unlimited vocabulary", Journal of Imaging, vol. 4, no. 2, 2018
    5. A. Pradhan, S. Behera and P. Pujari, "Comparative study on recent text line segmentation methods of unconstrained handwritten scripts", International Conference on Energy, Communication, Data Analytics and Soft Computing (ICECDS'17), pp. 3853-385, pp. 877-884, 2017
    6. M. Yashoda, S.K. Niranjan and V.N.M. Aradhya, "eLL: Enhanced Linked List - An Approach for Handwritten Text Segmentation", Fourth International Conference on Information Systems Design and Intelligent Applications (INDIA'17), pp. 877-884, 2017
    7. Y. Akbari, M.J. Jalili, J. Sadri, K. Nouri, I. Siddiqi and C. Djeddi, "A novel database for automatic processing of Persian handwritten bank checks", Pattern Recognition, vol. 74, pp. 253-265, 2018
    8. H. Jain and A.P. Kumar, "A Bottom Up Procedure for Text Line Segmentation of Latin Script", International Conference on Advances in Computing, Communications and Informatics (ICACCI'17), 2017
    9. S.M. Obaidullah, C. Halder, K.C. Santosh, N. Das and K. Roy, “PHDIndic_11: page-level handwritten document image dataset of 11 official Indic scripts for script identification”, Multimedia Tools and Applications, DOI: 10.1007/s11042-017-4373-y, 2017
    10. P. Sahare and S.B. Dhok, "Review of Text Extraction Algorithms for Scene-text and Document Images", IETE Technical Review, vol. 34, no. 2, pp. 144-164, 2017
    11. A. Abliz, W. Simayi, K. Moydin and A. Hamdulla, “Survey on Methods for Basic Unit Segmentation in Off-Line Handwritten Text Recognition”, International Journal of Future Generation Communication and Networking, vol. 9, no. 11, pp. 137-152, 2016
    12. K. Kadam, D. Phadatare, A. Mali, P. Nimbalkar and P. Gode, “Detection of Word by Inter - Intra Gap Technique for Handwritten Documents”, International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), vol. 5, no. 4, pp. 936-939, 2016
    13. P. Choudhary and N. Nain, “A Four-Tier Annotated Urdu Handwritten Text Image Dataset for Multidisciplinary Research on Urdu Script”, ACM Transactions on Asian and Low-Resource Language Information Processing, vol. 15, no. 4, article no. 26, 2016
    14. N. Aouadi and A. Kacem, “A proposal for touching component segmentation in Arabic manuscripts”, Pattern Analysis and Applications, DOI: 10.1007/s10044-016-0543-1, 2016
    15. P. Barlas, D. Hebert, C. Chatelain, S. Adam and T. Paquet, “Language identification in document images”, Journal of Imaging Science and Technology, vol. 60, no. 1, article number 010407, 2016
    16. A. Joshi and D. Bharadwaj, “A segmentation approach based on structured learning for recognition preprocessing”, International Conference on Automatic Control and Dynamic Optimization Techniques (ICACDOT'16), pp. 935-939, 2016
    17. A. Masomi, H.R. Ghafari, K. Nouri, Y. Akbari, W. Bouamra and C. Djeddi, “A new database for writer demographics attributes detection based on off-line Persian and English handwriting”, 1st Mediterranean Conference on Pattern Recognition and Artificial Intelligence (MedPRAI'16), pp. 125-130, 2016
    18. N. Aouadi, A.K. Echi and A. Belaid, “A Recognition based Approach for segmenting Touching Components in Arabic Manuscripts”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 21-25, Nancy, France, 2015
    19. J. Ryu, H.I. Koo and N.I. Cho, “Word segmentation method for handwritten documents based on structured learning”, IEEE Signal Processing Letters, vol. 22, no. 8, pp. 1161-1165, 2015
    20. S. Pannirselvam and S. Ponmani, “A Novel Hybrid Model For Tamil Handwritten Character Segmentation”, International Journal of Scientific & Engineering Research, vol. 5, no. 11, pp. 271-275, 2014
    21. N. Aouadi, A. Kacem and A. Belaïd, "Segmentation of Touching Component in Arabic Manuscripts", 14th International Conference on Frontiers in Handwriting Recognition (ICFHR'14), pp. 452-457, Crete, Greece, September 2014
    22. D. Hebert, P. Barlas, C. Chatelain, S. Adam and T. Paquet, "Writing type and language identification in heterogeneous and complex documents", 14th International Conference on Frontiers in Handwriting Recognition (ICFHR'14), pp. 411-416, Crete, Greece, September 2014
    23. Y. Tang, X. Wu, and W. Bu, “Text Line Segmentation Based on Matched Filtering and Top-Down Grouping for Handwritten Documents”, 11th IAPR International Workshop on Document Analysis Systems (DAS'14), Tours, France, pp. 365-369, 2014
    24. A. Fischer, M. Baechler, A. Garz, M. Liwicki and R. Ingold, “A Combined System for Text Line Extraction and Handwriting Recognition in Historical Documents”, 11th IAPR International Workshop on Document Analysis Systems (DAS'14), Tours, France, pp. 71-75, 2014
    25. J. Ryu, H.I. Koo and N.I. Cho, “Language-Independent Text-Line Extraction Algorithm for Handwritten Documents”, IEEE Signal Processing Letters, vol. 21, no. 9, pp. 1115-1119, 2014
    26. D. Fernández-Mota, J. Lladós and A. Fornés, “A graph-based approach for segmenting touching lines in historical handwritten documents”, International Journal on Document Analysis and Recognition, vol. 17, no. 3, pp. 293-312, 2014
    27. A. Lemaitre, J. Camillerapp and B. Coüasnon, “Handwritten text segmentation using blurred image”, Proceedings of SPIE - The International Society for Optical Engineering, vol. 9021, article number 90210D, Document Recognition and Retrieval XXI, San Francisco, United States, 2014
    28. Y. Wu, S. Zha, H. Cao, D. Liu, and P. Natarajan, “A Markov chain based line segmentation framework for handwritten character recognition”, Proceedings of SPIE - The International Society for Optical Engineering, vol. 9021, article number 90210C, Document Recognition and Retrieval XXI, San Francisco, United States, 2014
    29. M. Diem, F. Kleber, S. Fiel, and R. Sablatnig, “Semi-automated document image clustering and retrieval”, Proceedings of SPIE - The International Society for Optical Engineering, vol. 9021, article number 90210M, Document Recognition and Retrieval XXI, San Francisco, United States, 2014
    30. F. Cruz and O.R. Terrades, “Handwritten Line Detection via an EM algorithm”, 12th International Conference on Document Analysis and Recognition (ICDAR'13), pp. 718-722, Washington DC, USA, August 2013
    31. M. Diem, F. Kleber and R. Sablatnig, “Text Line Detection for Heterogeneous Documents”, 12th International Conference on Document Analysis and Recognition (ICDAR'13), pp. 743-747, Washington DC, USA, August 2013
    32. B. Moysset and C. Kermorvant, “On the evaluation of handwritten text line detection algorithms”, 12th International Conference on Document Analysis and Recognition (ICDAR'13), pp. 185-189, Washington DC, USA, August 2013
    33. I. Rabaev, O. Biller, J. El-Sana, K. Kedem and I. Dinstein, “Text Line Detection in Corrupted and Damaged Historical Manuscripts”, 12th International Conference on Document Analysis and Recognition (ICDAR'13), pp. 812-816, Washington DC, USA, August 2013
    34. X. Peng, H. Cao, S. Setlur, V. Govindaraju and P. Natarajan, “Multilingual OCR research and applications: an overview”, 4th International Workshop on Multilingual OCR (MOCR'13), Washington DC, USA, August 2013
    35. L. Kang, J. Kumar, P. Ye and D. Doermann, “Learning text-line segmentation using codebooks and graph partitioning”, 13th International Conference on Frontiers in Handwriting Recognition (ICFHR'12), pp. 63-68, Bari, Italy, September 2012
    36. I.B. Messaoud, H. Amiri, H.E. Abed and V. Märgner, “A multilevel text line segmentation framework for handwritten historical documents”, 13th International Conference on Frontiers in Handwriting Recognition (ICFHR'12), pp. 515-520, Bari, Italy, September 2012
    37. C. Djeddi, I. Siddiqi, L. Souici-Meslati and A. Ennaji, “Multi-script Writer Identification Optimized With Retrieval Mechanism”, 13th International Conference on Frontiers in Handwriting Recognition (ICFHR'12), pp. 507-512, Bari, Italy, September 2012
    38. A. Alaei, U. Pal and P. Nagabhushan, “Dataset and ground truth for handwritten text in four different scripts”, International Journal of Pattern Recognition and Artificial Intelligence, vol. 26, no. 4, Article number 1253001, 2012
    39. R. Sarkar, N. Das, S. Basu, M. Kundu, M. Nasipuri and D.K. Basu, “CMATERdb1: a database of unconstrained handwritten Bangla and Bangla–English mixed script document image”, International Journal on Document Analysis and Recognition, vol. 15, no. 1, pp. 71-83, 2012
    40. L. Kang and D. Doermann, “Template based Segmentation of Touching Components in Handwritten Text Lines”, 11th International Conference on Document Analysis and Recognition (ICDAR'11), pp. 569-573, Beijing, China, September 2011
    41. F. Simistira, V. Papavassiliou, T. Stafylakis and V. Katsouros, “Enhancing Handwritten Word Segmentation by Employing Local Spatial Features”, 11th International Conference on Document Analysis and Recognition (ICDAR'11), pp. 1314-1318, Beijing, China, September 2011
    42. V. Manohar, S.N. Vitaladevuni, H. Cao, R. Prasad and P. Natarajan, “Graph Clustering-based Ensemble Method for Handwritten Text Line Segmentation”, 11th International Conference on Document Analysis and Recognition (ICDAR'11), pp. 574-578, Beijing, China, September 2011
    43. Y. Gao, X. Ding and C. Liu, “A Multi-scale Text Line Segmentation Method in Freestyle Handwritten Documents”, 11th International Conference on Document Analysis and Recognition (ICDAR'11), pp. 643-647, Beijing, China, September 2011
    44. A. Alaei, P. Nagabhushan and U. Pal, “A Benchmark Kannada Handwritten Document Dataset and its Segmentation”, 11th International Conference on Document Analysis and Recognition (ICDAR'11), pp. 141-145, Beijing, China, September 2011
    45. J. Kumar, L. Kang, D. Doermann and W. Abd-Almageed, “Segmentation of Handwritten Textlines in Presence of Touching Components”, 11th International Conference on Document Analysis and Recognition (ICDAR'11), pp. 109-113, Beijing, China, September 2011
    46. A. Alaei, P. Nagabhushan and U. Pal, “Piece-wise painting technique for line segmentation of unconstrained handwritten text: a specific study with Persian text documents”, Pattern Analysis & Applications, vol. 14, no. 4, pp. 381-394, 2011
    47. A. Alaei, P. Nagabhushan and U. Pal, “A new dataset of Persian handwritten documents and its segmentation”, 7th Iranian Conference on Machine Vision and Image Processing (MVIP'11), pp. 1-5, Tehran, Iran, November 2011
    48. T.D. Nguyen and G. Lee, “Text line segmentation in handwritten document images using tensor voting”, IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. E94-A, no. 11, pp. 2434-2441, 2011
    49. E. Kavallieratou and F. Daskas, “Text Line Detection and Segmentation: Uneven Skew Angles and Hill-and-Dale Writing”, Journal of Universal Computer Science, vol. 17, no. 1, pp. 16-29, 2011
    50. A. Lemaitre, J. Camillerapp and B. Coüasnon, “A perceptive method for handwritten text segmentation”, Document Recognition and Retrieval XVIII - Electronic Imaging, San Francisco, United States, Article number 78740C, January 2011
    51. A. Alaei, U. Pal and P. Nagabhushan, “A new scheme for unconstrained handwritten text-line segmentation”, Pattern Recognition, vol. 44, no. 4, pp. 917-928, 2011
    52. P. Nagabhushan and A. Alaei, "Tracing and Straightening the Baseline in Handwritten Persian/Arabic Text-line: A New Approach Based on Painting-technique", International Journal on Computer Science and Engineering (IJCSE), vol. 2, no. 04, pp. 907-916, 2010
    53. V. Papavassiliou, V. Katsouros and G. Carayannis, “A Morphological Approach for Text-Line Segmentation in Handwritten Documents”, 12th International Conference on Frontiers in Handwriting Recognition (ICFHR'10), pp. 19-24, Kolkata, India, November 2010

  • N. Stamatopoulos, G. Louloudis and B. Gatos, “A Comprehensive Evaluation Methodology for Noisy Historical Document Recognition Techniques”, 3rd Workshop on Analytics for Noisy Unstructured Text Data (AND'09), pp. 47-54, Barcelona, Spain, July 2009.

  • In this paper, we propose a new comprehensive methodology in order to evaluate the performance of noisy historical document recognition techniques. We aim to evaluate not only the final noisy recognition result but also the main intermediate stages of text line, word and character segmentation. For this purpose, we efficiently create the text line, word and character segmentation ground truth guided by the transcription of the historical documents. The proposed methodology consists of (i) a semi-automatic procedure in order to detect the text line, word and character segmentation ground truth regions making use of the correct document transcription, (ii) calculation of proper evaluation metrics in order to measure the performance of the final OCR result as well as of the intermediate segmentation stages. The semi-automatic procedure for detecting the ground truth regions has been evaluated and proved efficient and time saving. Experimental results prove that using the proposed technique, the percentage of time saved for the text line, word and character segmentation ground truth creation is more than 90%. An analytic experiment using a commercial OCR engine applied to a historical book is also presented.
    1. C. Biswas, P.S. Mukherjee, K. Ghosh, U. Bhattacharya and S.K. Parui, “A Hybrid Deep Architecture for Robust Recognition of Text Lines of Degraded Printed Documents”, International Conference on Pattern Recognition (ICPR'18), pp. 3174-3179, 2018
    2. F.H.F. Wu, “Applying Machine Learning in Optical Music Recognition of Numbered Music Notation”, International Journal of Multimedia Data Engineering and Management (IJMDEM), vol. 8, no. 3, 2017
    3. R.C. Carrasco, “An open-source OCR evaluation tool”, First International Conference on Digital Access to Textual Cultural Heritage (DATeCH '14), pp. 179-184, 2014
    4. T. Shima, K. Terasawa and T. Kawashima, “Image Processing for Historical Newspaper Archives”, 1st Workshop on Historical Document Imaging and Processing (HIP'11), pp. 127-132, Beijing, China, September 2011
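The abstract of the evaluation paper above mentions metrics for the final OCR result without naming them. As a minimal sketch, assuming character accuracy derived from the Levenshtein edit distance (a common choice, not necessarily the metric used in the paper), the final recognition result could be scored against the ground-truth transcription as follows. The function names are hypothetical.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions each cost 1."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def character_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed metric: 1 - (edit errors / reference length), clipped at 0."""
    if not reference:
        return 1.0 if not hypothesis else 0.0
    errors = edit_distance(reference, hypothesis)
    return max(0.0, 1.0 - errors / len(reference))
```

Here `reference` would be the correct transcription of a text line and `hypothesis` the OCR output for the same line.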

  • N. Stamatopoulos, B. Gatos, I. Pratikakis and S.J. Perantonis, “A Two-Step Dewarping of Camera Document Images”, 8th International Workshop on Document Analysis Systems (DAS'08), pp. 209-216, Nara, Japan, September 2008.

  • Dewarping of camera document images has attracted a lot of interest over the last few years, since warping not only reduces document readability but also affects the accuracy of OCR applications. In this paper, a two-step approach for efficient dewarping of camera document images is presented. In the first step, a coarse dewarping is accomplished with the help of a transformation model which maps the projection of a curved surface to a 2D rectangular area. The projection of the curved surface is delimited by the two curved lines which fit the top and bottom text lines, along with the two straight lines which fit the left and right text boundaries. In the second step, fine dewarping is achieved based on word detection: all words are pose-normalized, guided by their lower and upper baselines. Experimental results on several camera document images demonstrate the robustness and effectiveness of the proposed technique.
    1. V.K.B. Ramanna, S. Bukhari and A. Dengel, “Document image dewarping using deep learning”, 8th International Conference on Pattern Recognition Applications and Methods (ICPRAM'19), pp. 524-531, 2019
    2. L.M. Laskov, “Methods for document image de-warping”, Astronomical and Astrophysical Transactions, vol. 30, no. 4, pp. 511-522, 2018
    3. R. Sun, S. Wang, L. Ji and Z. Wang, “Multi-scale document image rectification utilising text-features”, Electronics Letters, vol. 54, no. 8, pp. 502-503, 2018
    4. H.C. Vinod and S.K. Niranjan, “De-warping of camera captured document images”, 21st IEEE International Symposium on Consumer Electronics (ISCE'17), pp. 13-18, 2017
    5. H.I. Koo and N.I. Cho, “Document image rectification using single-view or two-view camera input”, Computational Photography: Methods and Applications (Book Chapter), pp. 313-338, 2017
    6. T. Kil, W. Seo, H.I. Koo and N.I. Cho, “Robust Document Image Dewarping Method Using Text-Lines and Line Segments”, 14th IAPR International Conference on Document Analysis and Recognition (ICDAR'17), pp. 865-870, 2017
    7. F. Bolelli, G. Borghi and C. Grana, "XDOCS: An Application to Index Historical Documents", Italian Research Conference on Digital Libraries and Multimedia Archives (IRCDL'18), 2018
    8. S.H. Lee, D. Kim, S. Jadhav and S. Lee, “A restoration method for distorted comics to improve comic contents identification”, International Journal on Document Analysis and Recognition (IJDAR), DOI https://doi.org/10.1007/s1003, 2017
    9. F. Bolelli, “Indexing of Historical Document Images: Ad Hoc Dewarping Technique for Handwritten Text”, 13th Italian Research Conference on Digital Libraries (IRCDL'17), pp. 45-55, 2017
    10. B.S. Kim, H.I. Koo and N.I. Cho, “Document Dewarping via Text-line based Optimization”, Pattern Recognition, doi:10.1016/j.patcog.2015.04.026, 2015
    11. L. Galarza, Z. Wang and M. Adjouadi, “Book spread correction using a time of flight imaging sensor”, International Conference on Image Processing, Computer Vision, and Pattern Recognition (IPCV'14), pp. 250-254, Las Vegas, USA, August 2014
    12. D. Oliveira, R. Lins, G. Torreão, J. Fan and M. Thielo, “An Efficient Algorithm for Segmenting Warped Text-lines in Document Images”, 12th International Conference on Document Analysis and Recognition (ICDAR'13), pp. 250-254, Washington DC, USA, August 2013
    13. Y. He, P. Pan, S. Xie, J. Sun and S. Naoi, “A book dewarping system by boundary-based 3D surface reconstruction”, 12th International Conference on Document Analysis and Recognition (ICDAR'13), pp. 403-407, Washington DC, USA, August 2013
    14. L. Tong, G. Zhan, Q. Peng, Y. Li and Y. Li, “Warped Document Image Mosaicing Method Based on Inflection Point Detection and Registration”, 4th International Conference on Multimedia Information Networking and Security (MINES'12), pp. 306-310, Nanjing, Jiangsu, China, November 2012
    15. V. Kluzner and A. Tzadok, “Page Curling Correction for Scanned Books Using Local Distortion Information”, 11th International Conference on Document Analysis and Recognition (ICDAR'11), pp. 890-894, Beijing, China, September 2011
    16. M. Rahnemoonfar and A. Antonacopoulos, “Restoration of Arbitrarily Warped Historical Document Images Using Flow Lines”, 11th International Conference on Document Analysis and Recognition (ICDAR'11), pp. 905-909, Beijing, China, September 2011
    17. C. Neudecker, Z.M. Dogan, S. Schlarb, P. Missier, S. Sufi, A. Williams and K. Wolstencroft, “An Experimental Workflow Development Platform for Historical Document Digitisation and Analysis”, 1st Workshop on Historical Document Imaging and Processing (HIP'11), pp. 161-168, Beijing, China, September 2011
    18. D.M. Oliveira, R.D. Lins, G. Torreão, J. Fan and M. Thielo, “A new algorithm for segmenting warped text-lines in document images”, ACM Symposium on Applied Computing (SAC'11), pp. 259-265, March 2011
    19. S.S. Bukhari, F. Shafait and T.M. Breuel, “Performance Evaluation of Curled Textlines Segmentation Algorithms”, 9th International Workshop on Document Analysis Systems (DAS'10), (short paper), pp. 555-558, Boston, MA, USA, June 2010
    20. R.D. Lins, D.M. Oliveira, G. Torreao, J. Fan and M. Thielo, “Correcting Book Binding Distortion in Scanned Documents”, 7th International Conference on Image Analysis and Recognition (ICIAR'10), pp. 355-365, Póvoa de Varzim, Portugal, June 2010
    21. D.M. Oliveira, R.D. Lins, G. Torreao, J. Fan and M. Thielo, “A New Method for Text-Line Segmentation for Warped Documents”, 7th International Conference on Image Analysis and Recognition (ICIAR'10), pp. 398-408, Póvoa de Varzim, Portugal, June 2010
    22. H.I. Koo and N.I. Cho, “State Estimation in a Document Image and Its Application in Text Block Identification and Text Line Extraction”, 11th European Conference on Computer Vision (ECCV'10), pp. 421-434, Heraklion, Crete, Greece, September 2010
    23. S.S. Bukhari, T.M. Breuel and F. Shafait, “Textline information extraction from grayscale camera-captured document images”, 16th International Conference on Image Processing (ICIP'09), pp. 2013-2016, Cairo, November 2009
    24. S.S. Bukhari, F. Shafait and T.M. Breuel, “Ridges based Curled Textline Region Detection from Grayscale Camera-Captured Document Images”, 13th International Conference on Computer Analysis of Images and Patterns (CAIP'09), pp. 173-180, Münster, Germany, September 2009
    25. S.S. Bukhari, F. Shafait and T.M. Breuel, “Coupled Snakelet Model for Curled Textline Segmentation of Camera-Captured Document Images”, 10th International Conference on Document Analysis and Recognition (ICDAR'09), pp. 61-65, Barcelona, Spain, July 2009
    26. S.S. Bukhari, F. Shafait and T.M. Breuel, “Dewarping of document images using coupled-snakes”, International Workshop on Camera-Based Document Analysis and Recognition (CBDAR'09), pp. 34-41, Barcelona, Spain, July 2009

  • G. Vamvakas, B. Gatos, N. Stamatopoulos and S.J. Perantonis, “A Complete Optical Character Recognition Methodology for Historical Documents”, 8th International Workshop on Document Analysis Systems (DAS'08), pp. 525-532, Nara, Japan, September 2008.

  • In this paper, a complete OCR methodology for recognizing historical documents, either printed or handwritten, without any knowledge of the font is presented. The methodology consists of three steps: the first two create a database for training using a set of documents, while the third recognizes new document images. First, a pre-processing step that includes image binarization and enhancement takes place. In the second step, a top-down segmentation approach is used to detect text lines, words and characters. A clustering scheme is then adopted in order to group characters of similar shape. This is a semi-automatic procedure, since the user is able to interact at any time in order to correct possible clustering errors and assign an ASCII label. After this step, a database is created to be used for recognition. Finally, in the third step, the above segmentation approach is applied to every new document image, and recognition is based on the character database produced in the previous step.
    1. S.K. Satapathy, S. Mishra, R.S. Sundeep, U.S.R. Teja, P.K. Mallick, M. Shruti and K. Shravya, "Deep learning based image recognition for vehicle number information", International Journal of Innovative Technology and Exploring Engineering, vol. 8, no. 8, pp. 52-55, 2019
    2. J. Shentu and M. Zheng, "Mechanism design of data management system for nuclear power", Annals of Nuclear Energy, vol. 129, pp. 21-29, 2019
    3. S.T. Deokate and N.J. Uke, "Devnagari Script Categorization by Utilizing CNN and KNN", International Journal of Innovative Technology and Exploring Engineering (IJITEE), vol. 8, no. 5, pp. 1136-1140, 2019
    4. N. Babu and A. Soumya, "Character Recognition in Historical Handwritten Documents–A Survey", International Conference on Communication and Signal Processing (ICCSP'19), pp. 299-304, 2019
    5. D.M. Kassa and H. Hagras, “An Adaptive Segmentation Technique for the Ancient Ethiopian Ge'ez Language Digital Manuscripts”, 10th Computer Science and Electronic Engineering Conference (CEEC'18), pp. 83-88, 2018
    6. C. Biswas, P.S. Mukherjee, K. Ghosh, U. Bhattacharya and S.K. Parui, “A Hybrid Deep Architecture for Robust Recognition of Text Lines of Degraded Printed Documents”, International Conference on Pattern Recognition (ICPR'18), pp. 3174-3179, 2018
    7. S. Choudhary, N.K. Singh and S. Chichadwani, "Text Detection and Recognition from Scene Images using MSER and CNN", 2nd International Conference on Advances in Electronics, Computers and Communications (ICAECC'18), no. 8479419, 2018
    8. P. Sharma, "A Survey on Optical Character Recognition Techniques", International Journal of Management, Technology And Engineering, vol. 8, pp. 2889-2895, 2018
    9. P. Chaturvedi, M. Saxena and B. Sharma, "A Bounding Box Approach for Performing Dynamic Optical Character Recognition in MATLAB", International Conference on Emerging Trends in Expert Applications & Security (ICETEAS 2018), pp. 117-123, Jaipur, India, 2018
    10. P. Kumari and A. Kalia, "A Comparative study of GOCR, Tesseract and Improved Tesseract for Character Recognition", International Journal of Technical Innovation in Modern Engineering & Science (IJTIMES), vol. 4, no. 10, pp. 345-352, 2018
    11. K. Kang and H. Xie, "Design and Implementation of Driver's License Recognition System", 13th International Conference on Computer Science & Education (ICCSE'18), pp. 140-143, 2018
    12. A. Farhat, O. Hommos, A. Al-Zawqari, A. Al-Qahtani, F. Bensaali, A. Amira and X. Zhai, "Optical character recognition on heterogeneous SoC for HD automatic number plate recognition system", Eurasip Journal on Image and Video Processing, vol. 2018, no. 1, 2018
    13. B. Arizanović and V. Vučković, "Efficient Compression and Decompression Algorithms for OCR Systems", Facta Universitatis, Series: Electronics and Energetics, vol. 31, no. 3, pp. 461-485, 2018
    14. D. Khurana and M. Malik, "Number Plate Detection: A Complete Review", International Journal of Engineering Technology and Computer Research (IJETCR), vol. 6, no. 3, pp. 4-8, 2018
    15. F.D. Nurzam and E.T. Luthfi, "Implementation of Real-Time Scanner Java Language Text with Mobile Vision Android Based", International Conference on Information and Communications Technology (ICOIACT'18), pp. 724-729, 2018
    16. J. Neema, M.C. Merin, M.J. Niya and T. Tresa, "Panulat-An Automated Pen", International Journal of Current Engineering and Scientific Research (IJCESR), vol. 5, no. 3, pp. 22-26, 2018
    17. G. Kotzé and F. Wolff, "Developing and evaluating a pipeline for Setswana OCR", Pattern Recognition Association of South Africa and Robotics and Mechatronics (PRASA-RobMech), pp. 236-241, 2017
    18. V. Vučković and B. Arizanović, "General Character Segmentation Approach for Machine-Typed Documents", 4th International Conference on Electrical, Electronic and Computing Engineering (ETRAN'17), pp. RTI2.2.1-6, 2017
    19. S. Garg and N. Mishra, “Pollution Check Control Using License Plate Extraction via Image Processing”, Soft Computing: Theories and Applications (SoCTA), pp. 133-146, 2017
    20. H. Modi and M.C. Parikh, “A Review on Optical Character Recognition Techniques”, International Journal of Computer Applications, vol. 160, no. 6, pp. 20-24, 2017
    21. A.A.H.O. Idris and I. Khirwar, “Number plate recognition: A brief overview”, International Journal For Technological Research In Engineering, vol. 4, no. 7, pp. 1023-1027, 2017
    22. V. Vučković and B. Arizanović, "Efficient Character Segmentation Approach for Machine-Typed Documents", Expert Systems with Applications, http://dx.doi.org/10.1016/j.eswa.2017.03.027, 2017
    23. V. Tumane, D. Chaurpagar, A. Somkuwar, G. Sonone and S. Marbade, “A novel approach for image cropping and automatic contact extraction from images”, International Journal of Research In Science & Engineering, vol. 3, no. 2, pp. 271-278, 2017
    24. P. Satyanarayana, K. Sujitha, V.S.A. Kiron, P.A. Reddy and M. Ganesh, “Assistance Vision for Blind People Using k-NN Algorithm and Raspberry Pi”, 2nd International Conference on Micro-Electronics, Electromagnetics and Telecommunications (ICMEET'16), pp. 113-122, 2016
    25. S. Deokate and N. Uke, “Various Traditional and Nature Inspired Approaches Used in Image Preprocessing”, International Conference on Advanced Technologies for Societal Applications (ICATSA'16), 2016
    26. S. Chaudhary, R. Malhotra, M. Jaiswal, S. Gupta and R. Ahuja, “An Android app OCR+: for Text Translator, Document Editor, Business Card Reader & Equation Solver”, International Journal of Engineering Applied Sciences and Technology, vol. 1, no. 7, pp. 92-95, 2016
    27. A. Farhat, A. Al-Zawqari, A. Al-Qahtani, O. Hommos, F. Bensaali, A. Amira and X. Zhai, “OCR based feature extraction and template matching algorithms for Qatari number plate”, International Conference on Industrial Informatics and Computer Systems (CIICS'16), Sharjah, pp. 1-5, 2016
    28. G. Agre, J. Pimple, V. Bhavsar, V. Sarode and P. Dhande, “Optimized search engine to find image by providing keyword”, International Journal of Technical Research and Applications, vol. 4, no. 2, pp. 68-71, 2016
    29. R. Hussain, A. Masood, H.A. Khan, K. Khurshid and I. Siddiqi, “Language Independent Keyword Based Information Retrieval System of Handwritten Documents using SVM Classifier and Converting Words into Shapes”, Pakistan Journal of Engineering and Applied Sciences, vol. 19, pp. 63-76, 2016
    30. M.A. Agrawal and M.P. Brijpuria, “A Dynamic Object Identification Protocol for Intelligent Robotic Systems”, International Journal of Image, Graphics and Signal Processing (IJIGSP), vol. 7, no. 8, pp. 35-41, 2015
    31. S.M. Aswatha, A.N. Talla, J. Mukhopadhyay and P. Bhowmick, “A method for extracting text from stone inscriptions using character spotting”, 12th Asian Conference on Computer Vision (ACCV'14), pp. 598-611, 2014
    32. W. Pantke, A. Haak and V. Märgner, “Color segmentation for historical documents using Markov random fields”, 6th International Conference on Soft Computing and Pattern Recognition (SoCPaR'14), pp. 151-156, 2014
    33. F. Hollaus, S. Fiel, S. Saleem, R. Sablatnig and A. Camba, “Manuscript Investigation in the Sinai II Project”, Digital Presentation and Preservation of Cultural and Scientific Heritage, issue IV, pp. 200-205, 2014
    34. S. Saleem, F. Hollaus and R. Sablatnig, “Recognition of degraded ancient characters based on dense SIFT”, First International Conference on Digital Access to Textual Cultural Heritage (DATeCH '14), pp. 15-20, 2014
    35. K. Fouladi, B.N. Araabi and E. Kabir, “A fast and accurate contour-based method for writer-dependent offline handwritten Farsi/Arabic subwords recognition”, International Journal on Document Analysis and Recognition, vol. 17, no. 2, pp. 181-203, 2014
    36. D.S. Patil and M.S. Patel, “Simple and Fast Method for Offline English Handwritten Word Recognition”, Transactions on Electrical and Electronics Engineering (ITSI - TEEE), vol. 1, no. 2, pp. 98-100, 2013
    37. A. Ul-Hasan, S.S. Bukhari, S.F. Rashid, F. Shafait and T.M. Breuel, “Semi-automated OCR database generation for Nabataean scripts”, 21st International Conference on Pattern Recognition (ICPR 2012), pp. 1667-1670, Tsukuba, Japan, November 2012
    38. Y. Chherawala and M. Cheriet, “W-TSV: Weighted topological signature vector for lexicon reduction in handwritten Arabic documents”, Pattern Recognition, vol. 45, no. 9, pp. 3277-3287, 2012
    39. T. Blanke, M. Bryant and M. Hedges, “Open source optical character recognition for historical research”, Journal of Documentation, vol. 68, no. 5, pp. 659-683, 2012
    40. M. Diem and R. Sablatnig, “Are Characters Objects?”, 12th International Conference on Frontiers in Handwriting Recognition (ICFHR'10), pp. 565-570, Kolkata, India, November 2010
    41. C. Colutto, “Introducing a new image dissimilarity measure with an application to character image clustering in degraded historical documents”, 9th International Workshop on Document Analysis Systems (DAS'10), pp. 325-332, Boston, MA, USA, June 2010
    42. D.R. Lee and S. Oh, “Minimum-Cost Path Algorithm for Separating Touching Characters”, 7th IASTED International Conference on Signal Processing, Pattern Recognition, and Applications (SPPRA'10), pp. 164-168, Innsbruck, Austria, February 2010
    43. M. Diem and R. Sablatnig, “Recognizing characters of ancient manuscripts”, Proceedings of SPIE - The International Society for Optical Engineering, vol. 7531, article number 753106, January 2010

  • N. Stamatopoulos, B. Gatos and S.J. Perantonis, “A Method for Combining Complementary Techniques for Document Image Segmentation”, 11th International Conference on Frontiers in Handwriting Recognition (ICFHR'08), pp. 235-240, Montreal, Canada, August 2008.

  • Image segmentation is a major task of handwritten document processing. Many of the proposed techniques for image segmentation are complementary, in the sense that each of them, using a different approach, can solve different difficult problems such as overlapping or touching components and the influence of author style. In this paper, a method for combining different segmentation techniques is presented. Our goal is to exploit the segmentation results of complementary techniques and specific features of the initial image so as to generate improved segmentation results. Experimental results on handwriting line segmentation methods demonstrate the effectiveness of the proposed combination method.
    1. E. Kavallieratou and F. Daskas, “Text Line Detection and Segmentation: Uneven Skew Angles and Hill-and-Dale Writing”, Journal of Universal Computer Science, vol. 17, no. 1, pp. 16-29, 2011

  • N. Stamatopoulos, B. Gatos and A. Kesidis, "Automatic Borders Detection of Camera Document Images", 2nd International Workshop on Camera-Based Document Analysis and Recognition (CBDAR'07), pp. 71-78, Curitiba, Brazil, September 2007.

  • Document images captured by a digital camera are often framed by a noisy black border or include noisy text regions from neighbouring pages. In this paper, we present a novel technique for enhancing document images captured by a digital camera by automatically detecting the document borders and cutting out noisy black borders as well as noisy text regions appearing from neighbouring pages. Our methodology is based on projection profiles combined with a connected component labelling process. Signal cross-correlation is also used in order to verify the detected noisy text areas. Experimental results on several camera document images, mainly historical documents, indicate the effectiveness of the proposed technique.
    1. A. Kordecki, “Fast document area detection for scanned images”, Proceedings of SPIE - The International Society for Optical Engineering, vol. 11041, art. no. 1104120, 2019
    2. A. Zhu, C. Zhang, Z. Li and S. Xiong, “Coarse-to-fine document localization in natural scene image with regional attention and recursive corner refinement”, International Journal on Document Analysis and Recognition (IJDAR), DOI: https://doi.org/10.1007/s10032-019-00341-0, 2019
    3. M.M. Reza, M.A. Rakib, S.S. Bukhari and A. Dengel, “A Robust Page Frame Detection Method for Complex Historical Document Images”, 8th International Conference on Pattern Recognition Applications and Methods (ICPRAM'19), 2019
    4. A. Kordecki, "Fast document area detection for scanned images", Eleventh International Conference on Machine Vision (ICMV'18), 1104120, 2018
    5. S.A. Jain, N.S. Rani and N. Chandan, “Image Enhancement of Complex Document Images Using Histogram of Gradient Features”, International Journal of Engineering & Technology, vol. 7, no. 4.36, pp. 780-783, 2018
    6. K.M. Hung, C.H. Yih and C.H. Yeh, “A Reading Assistant System Based on Restoring Warped Document Image”, Journal of Applied Science and Engineering, vol. 21, no. 3, pp. 475-484, 2018
    7. S. Dey, B. Mitra, J. Mukhopadhyay and S. Sural, “A Comparative Study of Margin Noise Removal Algorithms on MarNR: A Margin Noise Dataset of Document Images”, 1st International Workshop on Open Services and Tools for Document Analysis (ICDAR-OST'17), pp. 35-39, 2017
    8. K. Javed and F. Shafait, “Real-Time Document Localization in Natural Images by Recursive Application of a CNN”, 14th IAPR International Conference on Document Analysis and Recognition (ICDAR'17), pp. 105-110, 2017
    9. C. Adak, B.B. Chaudhuri and M. Blumenstein, “Legibility and Aesthetic Analysis of Handwriting”, 14th IAPR International Conference on Document Analysis and Recognition (ICDAR'17), pp. 175-182, 2017
    10. S. Prum, “Text-zone detection and rectification in document images captured by smartphone”, 1st EAI International Conference on Computer Science and Engineering (COMPSE'16), 2017
    11. C. Adak, B.B. Chaudhuri and M. Blumenstein, “Writer identification by training on one script but testing on another”, 23rd International Conference on Pattern Recognition (ICPR'16), pp. 1153-1158, 2016
    12. S. He and L. Schomaker, “Writer identification using curvature-free features”, Pattern Recognition, DOI: 10.1016/j.patcog.2016.09.044, 2016
    13. Z. Huang, J. Gu, G. Meng and C. Pan, "Text line extraction of curved document images using hybrid metric", 3rd IAPR Asian Conference on Pattern Recognition (ACPR), Malaysia, pp. 251-255, 2016
    14. D. Karatzas, V. Poulain d’Andecy, M. Rusinol, A. Chica and P.P. Vazquez, “Human-Document Interaction systems - a new frontier for document image analysis”, 12th Workshop on Document Analysis Systems (DAS'16), pp. 369-374, Santorini, Greece, 2016
    15. A. Chakraborty and M. Blumenstein, “Marginal Noise Reduction in Historical Handwritten Documents - A Survey”, 12th Workshop on Document Analysis Systems (DAS'16), pp. 323-328, Santorini, Greece, 2016
    16. T. Mondal, N. Ragot, J.Y. Ramel and U. Pal, “Performance Evaluation of DTW and its Variants for Word Spotting in Degraded Documents”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 1141-1145, Nancy, France, 2015
    17. M. Villegas, J.A. Sanchez and E. Vidal, “Optical Modelling and Language Modelling Trade-off for Handwritten Text Recognition”, 13th International Conference on Document Analysis and Recognition (ICDAR'15), pp. 831-835, Nancy, France, 2015
    18. M. Wagdy, I. Faye and D. Rohaya, “Border noise removal from the document image using X-Y cut and filtering technique based on morphological operation”, International Journal of Imaging and Robotics, vol. 15, no. 3, pp. 88-105, 2015
    19. M. Liu, C. Li, W. Zhu and A. Lim, “A morphology-based border noise removal method for camera-captured label images”, 5th International Workshop on Camera-Based Document Analysis and Recognition (CBDAR'13), pp. 126-138, Washington DC, USA, August 2013
    20. M. Wagdy, I. Faye and D. Rohaya, “Border Noise Removal and Clean Up Based on Retinex Theory”, 1st International Conference on Advanced Data and Information Engineering (DaEng-2013), Lecture Notes in Electrical Engineering, vol. 285, pp. 345-352, 2013
    21. M. Agrawal and D. Doermann, “Clutter noise removal in binary document images”, International Journal on Document Analysis and Recognition (IJDAR), vol. 16, no. 4, pp. 351-369, 2013
    22. S. Kaur and P.S. Mann, “Improved XY cut Page Segmentation Algorithm for Border Noise”, International Journal of Computer Science & Engineering Technology (IJCSET), vol. 3, no. 5, pp. 149-151, 2013
    23. S. Kaur, P.S. Mann and S. Kaur, “Page Segmentation using XY Cut Algorithm in OCR System-A Review”, International Journal of Computers and Technology (IJCT), vol. 6, no. 3, pp. 436-440, 2013
    24. S. Kaur, P.S. Mann and S. Khurana, “Page Segmentation in OCR System-A Review”, International Journal of Computer Science and Information Technologies (IJCSIT), vol. 4, no. 2, pp. 420-422, 2013
    25. M. Shamqoli and H. Khosravi, “Border detection of document images scanned from large books”, 8th Iranian Conference on Machine Vision and Image Processing (MVIP 2013), Zanjan, Iran, pp. 84-88, 2013
    26. M. Shamqoli and H. Khosravi, “Warped document restoration by recovering shape of the surface”, 8th Iranian Conference on Machine Vision and Image Processing (MVIP 2013), Zanjan, Iran, pp. 262-265, 2013
    27. F. Shafait and T.M. Breuel, “The Effect of Border Noise on the Performance of Projection-Based Page Segmentation Methods”, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 33, no. 4, pp. 846-851, 2011
    28. S.S. Bukhari, F. Shafait and T.M. Breuel, “Border Noise Removal of Camera-Captured Document Images using Page Frame Detection”, 4th International Workshop on Camera-Based Document Analysis and Recognition (CBDAR'11), Beijing, China, September 2011
    29. M.M. Haji, T.D. Bui and C.Y. Suen, “Simultaneous Document Margin Removal and Skew Correction Based on Corner Detection in Projection Profiles”, 15th International Conference on Image Analysis and Processing (ICIAP'09), pp. 1025-1034, Vietri sul Mare, Italy, September 2009
    30. F. Shafait, D. Keysers and T.M. Breuel, “Response to "Projection Methods Require Black Border Removal"”, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 31, no. 4, pp. 763-764, 2009
    31. F. Shafait, J. Beusekom, D. Keysers and T.M. Breuel, “Document cleanup using page frame detection”, International Journal on Document Analysis and Recognition (IJDAR), vol. 11, no. 2, pp. 81-96, 2008

  • B. Gatos, A. Antonacopoulos and N. Stamatopoulos, “ICDAR2007 Handwriting Segmentation Contest”, 9th International Conference on Document Analysis and Recognition (ICDAR'07), pp. 1284-1288, Curitiba, Brazil, September 2007.

  • This paper presents the results of the Handwriting Segmentation Contest that was organized in the context of the ICDAR2007 conference. The aim of this contest was to use well-established evaluation practices and procedures in order to record recent advances in off-line handwriting segmentation. Two benchmarking datasets (one for text line and one for word segmentation) were used in a common evaluation platform in order to test and compare all submitted algorithms for handwritten document segmentation in realistic circumstances. The results of the evaluation of five algorithms submitted by participants as well as of two state-of-the-art algorithms are presented. The performance evaluation method is based on counting the number of matches between the text lines or words detected by the algorithms and the text lines or words of the ground truth.
    1. S. Kundu, S. Paul, S.K. Bera, A. Abraham and R. Sarkar, "Text-line Extraction from Handwritten Document Images using GAN", Expert Systems with Applications, vol. 140, 2020
    2. M.H.M. Dyla and F. Morain-Nicolier, "Text line segmentation and binarization of handwritten historical documents using the fast and adaptive bidimensional empirical mode decomposition", Optik, vol. pp. 52-63, 2019
    3. M. Pastor, "Text baseline detection, a single page trained system", Pattern Recognition, vol. 94, pp. 149-161, 2019
    4. H. Jain and A.P. Kumar, "A Bottom Up Procedure for Text Line Segmentation of Latin Script", International Conference on Advances in Computing, Communications and Informatics (ICACCI'17), 2017
    5. Renuka and S. Terdal, "Markov Random Field Region Based Text Detection and Segmentation by Stroke Width Transformation", International Journal for Innovative Research in Science & Technology, vol. 4, no.2, pp. 195-200, 2017
    6. P. Sahare and S.B. Dhok, "Review of Text Extraction Algorithms for Scene-text and Document Images", IETE Technical Review, DOI: 10.1080/02564602.2016.1160805, 2016
    7. N.V. Borse and I.R. Shaikh, “Text Extraction from Handwritten Documents”, International Journal Of Engineering, Education And Technology (ARDIJEET), vol. 3, no.2, 2015
    8. T. Saba, A. Rehman, A. Altameem and M. Uddin, “Annotated comparisons of proposed preprocessing techniques for script recognition”, Neural Computing and Applications, DOI: 10.1007/s00521-014-1618-9, 2014
    9. Y. Tang, X. Wu, and W. Bu, “Text Line Segmentation Based on Matched Filtering and Top-Down Grouping for Handwritten Documents”, 11th IAPR International Workshop on Document Analysis Systems (DAS'14), Tours, France, pp. 365-369, 2014
    10. S.S. Bukhari, F. Shafait and T.M. Breuel, “Towards Generic Text-Line Extraction”, 12th International Conference on Document Analysis and Recognition (ICDAR'13), pp. 748-752, Washington DC, USA, August 2013
    11. N. Modi and K. Jindal, “Text Line detection and Segmentation in Handwritten Gurumukhi Scripts”, International Journal of Advanced Research in Computer Science and Software Engineering, vol. 3, no. 5, pp. 1075-1080, 2013
    12. R. Sarkar, S. Halder, S. Malakar, N. Das, S. Basu and M. Nasipuri, “Text line extraction from handwritten document pages based on line contour estimation”, 3rd International Conference on Computing, Communication and Networking Technologies (ICCCNT 2012), Article number 6395873, Coimbatore, India, 2012
    13. S. Jindal and G.S. Lehal, “Line segmentation of handwritten Gurmukhi manuscripts”, Workshop on Document Analysis and Recognition (DAR 2012), pp. 74-78, Mumbai, 2012
    14. A. Alaei, U. Pal and P. Nagabhushan, “Dataset and ground truth for handwritten text in four different scripts”, International Journal of Pattern Recognition and Artificial Intelligence, vol. 26, no. 4, Article number 1253001, 2012
    15. A. Rehman and T. Saba, “Off-line cursive script recognition: current advances, comparisons and remaining problems”, Artificial Intelligence Review, vol. 37, no. 4, pp. 261-288, 2012
    16. R. Saabni and J. El-Sana, “Language-Independent Text Lines Extraction Using Seam Carving”, 11th International Conference on Document Analysis and Recognition (ICDAR'11), pp. 563-568, Beijing, China, September 2011
    17. F. Simistira, V. Papavassiliou, T. Stafylakis and V. Katsouros, “Enhancing Handwritten Word Segmentation by Employing Local Spatial Features”, 11th International Conference on Document Analysis and Recognition (ICDAR'11), pp. 1314-1318, Beijing, China, September 2011
    18. A. Alaei, P. Nagabhushan and U. Pal, “Piece-wise painting technique for line segmentation of unconstrained handwritten text: a specific study with Persian text documents”, Pattern Analysis & Applications, vol. 14, no. 4, pp. 381-394, 2011
    19. A. Sánchez, C.A.B. Mello, P.D. Suárez and A. Lopes, “Automatic line and word segmentation applied to densely line-skewed historical handwritten document images”, Integrated Computer-Aided Engineering, vol. 18, no. 2, pp. 125-142, 2011
    20. A. Rehman and T. Saba, “Performance analysis of character segmentation approach for cursive script recognition on benchmark database”, Digital Signal Processing, vol. 21, no. 3, pp. 486-490, 2011
    21. E. Kavallieratou and F. Daskas, “Text Line Detection and Segmentation: Uneven Skew Angles and Hill-and-Dale Writing”, Journal of Universal Computer Science, vol. 17, no. 1, pp. 16-29, 2011
    22. A. Alaei, U. Pal and P. Nagabhushan, “A new scheme for unconstrained handwritten text-line segmentation”, Pattern Recognition, vol. 44, no. 4, pp. 917-928, 2011
    23. E. Kavallieratou, "Text line detection and segmentation: uneven skew angles and hill-and-dale writing", ACM Symposium on Applied Computing (SAC'10), Sierre, Switzerland, pp. 59-60, 2010
    24. V. Papavassiliou, V. Katsouros and G. Carayannis, “A Morphological Approach for Text-Line Segmentation in Handwritten Documents”, 12th International Conference on Frontiers in Handwriting Recognition (ICFHR'10), pp. 19-24, Kolkata, India, November 2010
    25. H.I. Koo and N.I. Cho, “State Estimation in a Document Image and Its Application in Text Block Identification and Text Line Extraction”, 11th European conference on Computer vision (ECCV'10), pp. 421-434, Heraklion, Crete, Greece, September 2010
    26. P. Nagabhushan and A. Alaei, "Tracing and Straightening the Baseline in Handwritten Persian/Arabic Text-line: A New Approach Based on Painting-technique", International Journal on Computer Science and Engineering (IJCSE'10), vol. 2, no. 04, pp. 907-916, 2010
    27. R. Doumat, E.E. Zsigmond and J.M. Pinon, “User Trace-Based Recommendation System for a Digital Archive”, 8th International Conference on Case-Based Reasoning (ICCBR'10), pp. 360-374, Alessandria, Italy, 2010
    28. N. Ouwayed, A. Belaïd and F. Auger, “General text line extraction approach based on locally orientation estimation”, 17th Document Recognition and Retrieval Conference (DRR'10), San Jose, CA, United States, pp. 1-10, 2010
    29. V. Papavassiliou, T. Stafylakis, V. Katsouros and G. Carayiannis, “Handwritten document image segmentation into text lines and words”, Pattern Recognition Journal, vol. 43, no. 1, pp. 369-377, 2010
    30. F. Kurniawan and D. Mohamad, “Performance Comparison between Contour-Based and Enhanced Heuristic-Based for Character Segmentation”, 5th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS'09), Marrakesh, Morocco, pp. 112-117, 2009
    31. A. Khandelwal, P. Choudhury, R. Sarkar, S. Basu, M. Nasipuri and N. Das, “Text Line Segmentation for Unconstrained Handwritten Document Images Using Neighborhood Connected Component Analysis”, 3rd International Conference on Pattern Recognition and Machine Intelligence (PreMI'09), pp. 369-374, New Delhi, India, December 2009
    32. S.S. Bukhari, F. Shafait and T.M. Breuel, “Script-Independent Handwritten Textlines Segmentation using Active Contours”, 10th International Conference on Document Analysis and Recognition (ICDAR'09), pp. 446-450, Barcelona, Spain, July 2009
    33. S.S. Bukhari, F. Shafait and T.M. Breuel, “Coupled Snakelet Model for Curled Textline Segmentation of Camera-Captured Document Images”, 10th International Conference on Document Analysis and Recognition (ICDAR'09), pp. 61-65, Barcelona, Spain, July 2009
    34. E. Saund, J. Lin and P. Sarkar, “PixLabeler: User Interface for Pixel-Level Labeling of Elements in Document Images”, 10th International Conference on Document Analysis and Recognition (ICDAR'09), pp. 646-650, Barcelona, Spain, July 2009
    35. R.P. Santos, G.S. Clemente, T.I. Ren and G.D. Cavalcanti, “Text Line Segmentation Based on Morphology and Histogram Projection”, 10th International Conference on Document Analysis and Recognition (ICDAR'09), pp. 651-655, Barcelona, Spain, July 2009
    36. A. Rehman, D. Mohamad, F. Kurniawan and M. Ilays, “Performance analysis of segmentation approach for cursive handwriting on benchmark database”, International Conference on Computer Systems and Applications (AICCSA'09), pp. 265-270, Rabat, Morocco, May 2009
    37. R. Doumat, E. Egyed-Zsigmond and J.M. Pinon, “Digitized ancient documents...What's next?”, Document Numerique, vol. 12, no. 1, pp. 31-51, 2009
    38. T. Stafylakis, V. Papavassiliou, V. Katsouros and G. Carayiannis, “Robust text-line and word segmentation for handwritten documents images”, International Conference on Acoustics, Speech and Signal Processing, pp. 3393–3396, Las Vegas, USA, 2008
    39. S.S. Bukhari, F. Shafait and T.M. Breuel, “Segmentation of Curled Textlines Using Active Contours”, 8th International Workshop on Document Analysis Systems (DAS'08), pp. 270-277, Nara, Japan, September 2008
    40. R. Doumat, E.E. Zsigmond, J.M. Pinon and E. Csiszar, “Online ancient documents: Armarius”, 8th ACM symposium on Document engineering (DocEng'08), pp. 127-130, Sao Paulo, Brazil, 2008
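
The match-counting evaluation described in the ICDAR2007 contest abstract above can be illustrated with a minimal sketch. It assumes that each region (text line or word) is given as a set of pixel coordinates, that a pair is scored by intersection over union, and that a pair counts as a one-to-one match when the score reaches a threshold. The 0.95 threshold, the function names, and the summary into detection rate, recognition accuracy and F-measure are assumptions made for illustration; the abstract itself only states that matches are counted.

```python
def match_score(region_a, region_b):
    """Intersection over union of two pixel sets (assumed match measure)."""
    union = region_a | region_b
    if not union:
        return 0.0
    return len(region_a & region_b) / len(union)


def count_one_to_one(detected, ground_truth, threshold=0.95):
    """Count detected regions that match a distinct ground-truth region."""
    used = set()
    matches = 0
    for det in detected:
        for idx, gt in enumerate(ground_truth):
            if idx not in used and match_score(det, gt) >= threshold:
                used.add(idx)
                matches += 1
                break
    return matches


def evaluate(detected, ground_truth, threshold=0.95):
    """Return (detection rate, recognition accuracy, F-measure)."""
    o2o = count_one_to_one(detected, ground_truth, threshold)
    dr = o2o / len(ground_truth) if ground_truth else 0.0
    ra = o2o / len(detected) if detected else 0.0
    fm = 2 * dr * ra / (dr + ra) if dr + ra else 0.0
    return dr, ra, fm


# Toy data (hypothetical): two ground-truth lines, three detected regions.
gt_lines = [{(x, 0) for x in range(20)}, {(x, 5) for x in range(20)}]
det_lines = [
    {(x, 0) for x in range(20)},  # exact match with the first line
    {(x, 5) for x in range(10)},  # covers only half of the second line
    {(x, 9) for x in range(4)},   # spurious region
]
dr, ra, fm = evaluate(det_lines, gt_lines)
```

With this toy input only the first detected region reaches the threshold, so one match is counted against two ground-truth lines and three detections.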

  • G. Vamvakas, B. Gatos, S. Petridis and N. Stamatopoulos, “An Efficient Feature Extraction and Dimensionality Reduction Scheme for Isolated Greek Handwritten Character Recognition”, 9th International Conference on Document Analysis and Recognition (ICDAR'07), pp. 1073-1077, Curitiba, Brazil, September 2007.

  • In this paper, we present an off-line methodology for isolated Greek handwritten character recognition based on efficient feature extraction followed by a suitable feature vector dimensionality reduction scheme. Extracted features are based on (i) horizontal and vertical zones, (ii) the projections of the character profiles, (iii) distances from the character boundaries and (iv) profiles from the character edges. The combination of these types of features leads to a 325-dimensional feature vector. In the next step, a dimensionality reduction technique is applied, which reduces the feature space to only those features pertinent to discriminating characters within the given set of letters. In this paper, we also present a new Greek handwritten database of 36,960 characters that we created in order to measure the performance of the proposed methodology.
    1. R. Latypov and E. Stolov, “A New Method for Slant Calculation in Off-Line Handwriting Analysis”, 41st International Conference on Telecommunications and Signal Processing (TSP'18), 2018
    2. M. Yağanoğlu and C. Köse, “Wearable Vibration Based Computer Interaction and Communication System for Deaf”, Applied Sciences, vol. 7, no. 12, 2017
    3. V.L. Padmalatha and M. Sampoorna, “Optimized Voronoi Image Zoning for Handwritten Character Recognition based on Kohonen Neural Networks”, International Journal of Engineering Applied Sciences and Technology, vol. 2, no. 2, pp. 76-80, 2016
    4. R. Hussain, A. Raza, I. Siddiqi, K. Khurshid and C. Djeddi, “A comprehensive survey of handwritten document benchmarks: structure, usage and evaluation”, EURASIP Journal on Image and Video Processing, DOI: 10.1186/s13640-015-0102-5, 2015
    5. L.P. Saxena, “A correlation coefficient based model to separate and classify noncursive (Grantha script) symbols”, International Journal on Electrical Engineering and Informatics, vol. 7, no. 3, pp. 531-540, 2015
    6. D. Kumar Verma and A. Khatri, “An Improvement Study for Optical Character Recognition by using Inverse–SVM in Image Processing Technique”, International Journal of Advanced Research in Education Technology (IJARET), vol. 2, no. 2, pp. 101-105, 2015
    7. S.P. Patil and M.P.P. Kulkarni, “Online Handwritten Sanskrit Character Recognition Using Support Vector Classification”, International Journal of Engineering Research and Applications, vol. 4, no. 5, pp. 82-91, 2015
    8. P. Kulkarni, S. Patil and G. Dhanokar, “Review On Marathi And Sanskrit Word Recognition Using Genetic Algorithm”, International Journal of Informative & Futuristic Research, vol. 2, no. 7, pp. 2144-2152, 2015
    9. R. Kaur and S. Gujral, “Recognition of similar shaped isolated handwritten Gurumukhi characters using machine learning”, 5th International Conference on Confluence The Next Generation Information Technology Summit, India, pp. 251-256, 2014
    10. M. Kumar, M.K. Jindal and R.K. Sharma, “A Novel Hierarchical Technique for Offline Handwritten Gurmukhi Character Recognition”, National Academy Science Letters, vol. 37, no. 6, pp. 567-572, 2014
    11. R. Kaur and S. Gujral, “Recognition of Similar Shaped Isolated Gurumukhi Characters Using ML Algorithms”, 2nd International Conference on Computer and Intelligent Systems (ICCIS’14), Bangkok, Thailand, pp. 41-46, 2014
    12. S. Panwar and N. Nain, “An Efficient Feature Extraction Method for Segmented Cursive Characters Recognition”, International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO'14), Adriatic Coast, Croatia, pp. 1153-1158, 2014
    13. D. Impedovo and G. Pirlo, “Zoning Methods for Handwritten Character Recognition: A Survey”, Pattern Recognition, vol. 47, no. 3, pp. 969-981, 2013
    14. H. Bobade and A. Sahu, “Character Recognition Technique using Neural Network”, International Journal of Engineering Research and Applications (IJERA), vol. 3, no. 2, pp. 1778-1783, 2013
    15. O.P. Sharma, M.K. Ghose, K.B. Shah and B.K. Thakur, “Recent Trends and Tools for Feature Extraction in OCR Technology”, International Journal of Soft Computing & Engineering, vol. 2, no. 6, pp. 220-223, 2013
    16. S.A. Vaidya and B.R. Bombade, “A Novel Approach of Handwritten Character Recognition using Positional Feature Extraction”, International Journal of Computer Science and Mobile Computing, vol. 2, no. 6, pp. 179-186, 2013
    17. G. Pirlo and D. Impedovo, “Adaptive membership functions for handwritten character recognition by Voronoi-based image zoning”, IEEE Transactions on Image Processing, vol. 21, no. 9, pp. 3827-3837, 2012
    18. K.S. Siddharth, M. Jangid, R. Dhir and R. Rani, “Handwritten Gurmukhi Character Recognition Using Statistical and Background Directional Distribution”, International Journal on Computer Science and Engineering, vol. 3, no. 6, pp. 2332-2345, 2011
    19. S. Dabra, S. Agrawal and R.K. Challa, “A novel feature set for recognition of similar shaped handwritten Hindi characters using machine learning”, 1st International Conference on Computer Science, Engineering and Applications (CCSEA'11), pp. 25-35, Chennai, India, July 2011

  • G. Vamvakas, N. Stamatopoulos, B. Gatos, I. Pratikakis and S.J. Perantonis, “Greek Handwritten Character Recognition”, 11th Panhellenic Conference on Informatics (PCI'07), pp. 343-352, Patras, Greece, May 2007.

  • In this paper, we present a database and methods for off-line isolated Greek handwritten character recognition. The Computational Intelligence Laboratory (CIL) Database consists of 35,000 isolated and labelled Greek handwritten characters. This database was tested with an existing structural approach for Greek handwritten characters as well as with a novel approach based on a hybrid feature extraction scheme. According to this approach, two types of features are combined in a hybrid fashion. The first one divides the character image into a set of zones and calculates the density of the character pixels in each zone. In the second type of features, the area that is formed from the projections of the upper and lower as well as of the left and right character profiles is calculated. For the classification step, Support Vector Machines (SVM) and a Euclidean Minimum Distance Classifier (EMDC) are used.
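
The zone-density features and the Euclidean Minimum Distance Classifier mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a binary character image stored as a list of rows of 0/1 values whose height and width divide evenly by the zone grid, and the 2x2 grid, the function names and the toy glyphs are all hypothetical.

```python
import math


def zone_densities(image, rows, cols):
    """Split a binary image into rows x cols zones and return the
    foreground-pixel density of each zone (assumes exact division)."""
    height, width = len(image), len(image[0])
    zone_h, zone_w = height // rows, width // cols
    features = []
    for r in range(rows):
        for c in range(cols):
            total = sum(
                image[y][x]
                for y in range(r * zone_h, (r + 1) * zone_h)
                for x in range(c * zone_w, (c + 1) * zone_w)
            )
            features.append(total / (zone_h * zone_w))
    return features


def class_means(samples):
    """samples maps label -> list of feature vectors; returns label -> mean vector."""
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in samples.items()
    }


def emdc_predict(means, features):
    """Assign the label whose mean vector is nearest in Euclidean distance."""
    return min(means, key=lambda label: math.dist(means[label], features))


# Toy 4x4 glyphs (hypothetical): a vertical bar and a thick top bar.
vertical_bar = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0]]
top_bar = [[1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
means = class_means({
    "I": [zone_densities(vertical_bar, 2, 2)],
    "T": [zone_densities(top_bar, 2, 2)],
})

# A slightly damaged vertical bar should still land closest to class "I".
query = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 0, 0], [0, 1, 1, 0]]
label = emdc_predict(means, zone_densities(query, 2, 2))
```

A minimum-distance classifier of this kind needs only one mean vector per class, which makes it a cheap baseline next to an SVM.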

  • G. Vamvakas, B. Gatos, I. Pratikakis, N. Stamatopoulos, A. Roniotis and S.J. Perantonis, “Hybrid Off-Line OCR for Isolated Handwritten Greek Characters”, 4th IASTED International Conference on Signal Processing, Pattern Recognition, and Applications (SPPRA'07), pp. 197-202, Innsbruck, Austria, February 2007.

  • In this paper, we present an off-line OCR methodology for isolated handwritten Greek characters mainly based on a robust hybrid feature extraction scheme. First, image pre-processing is performed in order to normalize the character images as well as to correct character slant. At the next step, two types of features are combined in a hybrid fashion. The first one divides the character image into a set of zones and calculates the density of the character pixels in each zone. In the second type of features, the area that is formed from the projections of the upper and lower as well as of the left and right character profiles is calculated. For the classification step, Support Vector Machines (SVM) are used. The performance of the proposed methodology is demonstrated after testing with the CIL database (handwritten Greek character database), which was created from 100 different writers.
    1. S. Fujino, T. Hasegawa, M. Ueno, N. Mori and K. Matsumoto, “The Convolutional Neural Network Model Based on an Evolutionary Approach For Interactive Picture Book”, 20th Asia Pacific Symposium - Intelligent and Evolutionary Systems (IES'16), pp. 103-106, Canberra, Australia, 2016
    2. G. Pagare and K. Verma, “Associative Memory Model for Distorted On-Line Devanagari Character Recognition”, 5th International Conference on Advances in Computing and Communications (ICACC'15), pp. 46-49, Kerala, India, 2015
    3. T.V. Thach, N.H. Phi and H. Trang, “Isolated Vietnamese handwriting recognition embedded system applied combined feature extraction method”, 8th International Conference on Advanced Technologies for Communications (ATC'15), Viet Nam, 2015
    4. M. Ueno, K. Fukuda, A. Yasui, N. Mori, and K. Matsumoto, “Casook: Creative animating sketchbook”, Pattern Recognition, 12th International Symposium on Distributed Computing and Artificial Intelligence (DCAI'15), Spain, 2015
    5. D. Impedovo and G. Pirlo, “Zoning Methods for Handwritten Character Recognition: A Survey”, Pattern Recognition, vol. 47, no. 3, pp. 969-981, 2014
    6. H. Pham-Van, H.T. Nguyen and S.J. Wu, “Vietnamese handwriting recognition for automatic data entry in enrollment forms”, 2nd International Conference on Information Technology and Electronic Commerce (ICITEC'14), pp. 141-145, 2014
    7. D. Impedovo, F.M. Mangini and G. Pirlo, “A new adaptive zoning technique for handwritten digit recognition”, 17th International Conference on Image Analysis and Processing (ICIAP'13), pp. 91-100, Naples, Italy, September 2013
    8. O.P. Sharma, M.K. Ghose, K.B. Shah and B.K. Thakur, “Recent Trends and Tools for Feature Extraction in OCR Technology”, International Journal of Soft Computing & Engineering, vol. 2, no. 6, pp. 220-223, 2013
    9. A. Rehman and T. Saba, “Off-line cursive script recognition: current advances, comparisons and remaining problems”, Artificial Intelligence Review, vol. 37, no. 4, pp. 261-288, 2012
    10. T. Saba, A. Rehman, and G. Sulong, “Off-line cursive script recognition: current advances, comparisons and remaining problems”, International Journal of Innovative Computing, Information and Control, vol.7, no. 9, pp. 5211-5224, 2011
    11. G. Paliouras, C.D. Spyropoulos and G. Tsatsaronis, “Bootstrapping Ontology Evolution with Multimedia Information Extraction”, Multimedia Information Extraction, LNAI 6050, pp. 1–17, 2011
    12. H. Hamdi and M. Khemakhem, “Distributing Arabic Handwriting Recognition System Based on the Combination of Grid Meta-Scheduling and P2P Technologies (Omnivore)”, Universal Journal of Computer Science and Engineering Technology, vol. 1, no. 1, pp. 31-35, 2010
    13. P.A. Phuong, N.Q. Tao and L.C. Mai, “An Efficient Model for Isolated Vietnamese Handwritten Recognition”, International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP'08), pp. 358-361, 2008
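
The second feature type in the abstract above, areas formed from the upper, lower, left and right character profiles, can be sketched roughly as below. This is one possible reading under stated assumptions, not the published method: a profile is taken here as the distance from an image edge to the first foreground pixel along each column or row, and the "area" is the sum of those distances within equal-width blocks. The block count and every function name are hypothetical.

```python
def upper_profile(image):
    """For each column, count the rows from the top edge down to the first
    foreground pixel (the full height if the column is empty)."""
    height = len(image)
    return [
        next((y for y in range(height) if image[y][x]), height)
        for x in range(len(image[0]))
    ]


def lower_profile(image):
    """Same measurement taken from the bottom edge."""
    return upper_profile(image[::-1])


def left_profile(image):
    """For each row, distance from the left edge to the first foreground pixel."""
    transposed = [list(column) for column in zip(*image)]
    return upper_profile(transposed)


def right_profile(image):
    """Same measurement taken from the right edge."""
    return left_profile([row[::-1] for row in image])


def block_areas(profile, blocks):
    """Sum profile values inside equal-width blocks (assumes exact division)."""
    size = len(profile) // blocks
    return [sum(profile[i * size:(i + 1) * size]) for i in range(blocks)]


def profile_features(image, blocks=2):
    """Concatenate block areas of the four profiles into one feature vector."""
    features = []
    for profile_fn in (upper_profile, lower_profile, left_profile, right_profile):
        features.extend(block_areas(profile_fn(image), blocks))
    return features


# Toy 4x4 "L"-shaped glyph (hypothetical input).
glyph = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]
features = profile_features(glyph)
```

Such a vector would typically be concatenated with the zone densities before classification, which is the hybrid combination the abstract describes.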

Contact

NCSR "Demokritos"
153 10 Aghia Paraskevi, Athens, Greece

Call me : +30 210 650 3141

E-mail : nstam@iit.demokritos.gr