DIA Research

Document Image Analysis (DIA) Research:

Document Image Enhancement: Adaptive binarization and enhancement of degraded documents; removal of degradations due to shadows, non-uniform illumination, low contrast and smear [ref]; recovery of arbitrarily warped documents [ref]; skew correction [ref]; noisy border removal [ref]; image up-sampling and post-processing to improve the quality of text regions and preserve stroke connectivity.
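As an illustration of what adaptive binarization means in practice, the following is a minimal sketch of local-mean thresholding in plain Python. It is a generic textbook technique, not the method of the referenced publications; the function name `adaptive_binarize` and the parameters `window` and `offset` are assumptions chosen for this example. A pixel is marked as text when it is clearly darker than the average of its neighbourhood, which makes the decision robust to shadows and non-uniform illumination.

```python
def adaptive_binarize(image, window=15, offset=10):
    """Local-mean thresholding (illustrative sketch, hypothetical parameters).

    image  : list of rows of grey levels (0 = black, 255 = white)
    window : side length of the square neighbourhood, in pixels
    offset : how much darker than the local mean a pixel must be
    Returns a matrix of the same size with 1 for text and 0 for background.
    """
    h, w = len(image), len(image[0])

    # Integral image with a zero border, so any window sum costs O(1).
    integral = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += image[y][x]
            integral[y + 1][x + 1] = integral[y][x + 1] + row_sum

    r = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - r), min(h, y + r + 1)
        for x in range(w):
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            area = (y1 - y0) * (x1 - x0)
            total = (integral[y1][x1] - integral[y0][x1]
                     - integral[y1][x0] + integral[y0][x0])
            # Equivalent to: pixel < local_mean - offset, without division.
            if image[y][x] * area < total - offset * area:
                out[y][x] = 1
    return out
```

On a page whose background brightens from left to right, a single global threshold would label the dark side of the background as text, whereas the local comparison only responds to strokes that stand out from their surroundings.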

Typewritten OCR: Extraction of curvature features from characters [ref]; binary-tree-based OCR [ref]; multi-classifier OCR [ref]; neural network, SVM and KNN classifiers for typewritten OCR [ref]; text line position determination [ref]; word segmentation [ref]; segmentation, recognition and article tracking for old newspapers [ref].

Handwritten OCR: Handwritten character representation; segmentation-free OCR [ref]; feature extraction based on the skeletonized character body; topological description of the character skeleton for old Greek handwritten manuscript recognition [ref]; recognition of isolated handwritten Greek characters [ref]; cursive handwritten word recognition [ref]; text line detection in handwritten documents [ref]; database of Greek handwritten characters [ref].

 

Word spotting: Keyword search in historical typewritten documents; word retrieval optimized by user feedback.

 

   
Page Segmentation: Automatic extraction of the main document image components (text, titles, images, captions, graphics, lines, special symbols); location of segmentation areas using isothetic polygons; segmentation of newspaper pages into specific items (main titles, headlines, over-titles, sub-titles, references); article identification and reconstruction. Doc1, Doc2, Doc3

Page Segmentation & Handwriting Segmentation Evaluation: Evaluation of page segmentation and region classification sub-systems; ground-truth maker for annotating segmentation results; co-organization of the following competitions:

ICDAR2007 Page Segmentation Competition Doc1

ICDAR2007 Handwriting Segmentation Contest Doc2

ICDAR2005 Page Segmentation Competition Doc3

ICDAR2003 Page Segmentation Competition Doc4

First International Newspaper Page Segmentation Contest (ICDAR2001) Doc5
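To make the idea of segmentation evaluation concrete, here is a minimal sketch that compares detected regions with ground-truth regions. It is not the evaluation protocol of the competitions listed above: the rectangle representation, the intersection-over-union score, the greedy one-to-one matching, and the names `iou`, `evaluate` and `threshold` are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes (left, top, right, bottom).

    Right and bottom coordinates are treated as exclusive.
    """
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def evaluate(ground_truth, detected, threshold=0.9):
    """Greedy one-to-one matching of detected regions to ground truth.

    Returns (matches, detection_rate, precision), where
    detection_rate = matches / number of ground-truth regions and
    precision      = matches / number of detected regions.
    """
    used = set()
    matches = 0
    for gt_box in ground_truth:
        best_score, best_index = 0.0, None
        for index, det_box in enumerate(detected):
            if index in used:
                continue
            score = iou(gt_box, det_box)
            if score > best_score:
                best_score, best_index = score, index
        # Count a match only when the overlap is high enough.
        if best_index is not None and best_score >= threshold:
            used.add(best_index)
            matches += 1
    detection_rate = matches / len(ground_truth) if ground_truth else 0.0
    precision = matches / len(detected) if detected else 0.0
    return matches, detection_rate, precision
```

A strict threshold penalises both split and merged regions, since neither a fragment nor an oversized region overlaps its ground-truth counterpart closely enough to count as a match.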

Line and Table Detection: Automatic table detection in document images; morphological operations to connect line breaks and enhance line segments; horizontal and vertical line detection; removal of image/text areas; detection of line intersections; table reconstruction. Doc1
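The line-detection steps named above can be sketched with run-length operations on a binary image. This is a simplified stand-in, not the published method: bridging short gaps plays the role of a morphological closing with a horizontal structuring element, and keeping only long runs plays the role of an opening. The function names and the `max_gap` and `min_len` parameters are hypothetical.

```python
def bridge_gaps(row, max_gap):
    """Fill runs of background no longer than max_gap between two ink pixels."""
    out = list(row)
    last_on = None
    for x, value in enumerate(row):
        if value:
            if last_on is not None and 0 < x - last_on - 1 <= max_gap:
                for g in range(last_on + 1, x):
                    out[g] = 1
            last_on = x
    return out


def keep_long_runs(row, min_len):
    """Keep only runs of ink at least min_len pixels long."""
    out = [0] * len(row)
    x = 0
    while x < len(row):
        if row[x]:
            start = x
            while x < len(row) and row[x]:
                x += 1
            if x - start >= min_len:
                for i in range(start, x):
                    out[i] = 1
        else:
            x += 1
    return out


def horizontal_lines(img, max_gap=2, min_len=8):
    """Mask of horizontal line pixels: connect line breaks, drop short runs."""
    return [keep_long_runs(bridge_gaps(row, max_gap), min_len) for row in img]


def vertical_lines(img, max_gap=2, min_len=8):
    """Same operation applied to columns by transposing the image."""
    columns = [list(col) for col in zip(*img)]
    kept = horizontal_lines(columns, max_gap, min_len)
    return [list(row) for row in zip(*kept)]


def line_intersections(img, max_gap=2, min_len=8):
    """Pixels (row, column) that lie on both a horizontal and a vertical line."""
    hor = horizontal_lines(img, max_gap, min_len)
    ver = vertical_lines(img, max_gap, min_len)
    return [(y, x)
            for y in range(len(img))
            for x in range(len(img[0]))
            if hor[y][x] and ver[y][x]]
```

Short runs such as character strokes are discarded by the length test, so text areas drop out of the line masks, and the remaining intersections give candidate cell corners for table reconstruction.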

Camera-Based Document Analysis & Recognition: Text detection in indoor/outdoor scene images and video frames; efficient binarization and enhancement of camera images; connected component analysis to detect text regions; preprocessing that allows camera images to be successfully processed and recognized by commercial OCR engines. Doc1, Doc2, Doc3, Doc4
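Connected component analysis, mentioned above as the basis for locating text regions, can be sketched as follows. This is a generic breadth-first labelling with 8-connectivity followed by a simple size filter; the filter rule and the names `connected_components`, `text_candidates`, `min_height`, `max_height` and `max_aspect` are assumptions for illustration, not the criteria used in the referenced work.

```python
from collections import deque


def connected_components(img):
    """Bounding boxes (top, left, bottom, right) of 8-connected ink regions.

    img is a binary matrix (1 = ink). Boxes are returned in the order in
    which their first pixel is met during a row-by-row scan.
    """
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def text_candidates(boxes, min_height=3, max_height=40, max_aspect=5.0):
    """Keep boxes whose size and shape are plausible for a character."""
    kept = []
    for top, left, bottom, right in boxes:
        height = bottom - top + 1
        width = right - left + 1
        if min_height <= height <= max_height and width / height <= max_aspect:
            kept.append((top, left, bottom, right))
    return kept
```

In a scene image, the surviving boxes would typically be grouped into lines by proximity and alignment before being cropped and passed to an OCR engine.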

Text Identification in Web Images: Web image processing for text area identification; preparation of Web images so that OCR gives the best results; a conditional dilation technique to detect text and inverted text areas; processing of low-resolution Web images that consist mainly of graphic objects and are anti-aliased. Doc1, Doc2

 Last updated: 19-10-2007.