Gabor filters for Document analysis in Indian Bilingual Documents

Reasonable success has been achieved at developing monolingual OCR systems in Indian scripts. Scientists, optimistically, have started to look beyond. Development of bilingual OCR systems and OCR systems with capability to identify the text areas are some of the pointers to future activities in I...

Bibliographic Details
Main Authors: Peeta Basa Pati, S Sabari Raju, Nishikanta Pati, A G Ramakrishnan
Other Authors: The Pennsylvania State University CiteSeerX Archives
Format: Text
Language: English
Published: 2004
Online Access: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.132.5139
http://eprints.iisc.ernet.in/archive/00000386/01/gabor.pdf
Description
Summary: Reasonable success has been achieved in developing monolingual OCR systems for Indian scripts. Scientists, optimistically, have started to look beyond. Development of bilingual OCR systems and of OCR systems capable of identifying text areas are some of the pointers to future activities in the Indian scenario. The separation of text and non-text regions before considering the document image for OCR is an important task. In this paper, we present a biologically inspired, multi-channel filtering scheme for page layout analysis. The same scheme has been used for script recognition as well. Parameter tuning is mostly done heuristically. The scheme has also been seen to be computationally viable for commercial OCR system development.
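
Illustration: the summary describes a multi-channel filtering scheme without giving its details, so the following is only a minimal sketch of what a Gabor filter bank of this general kind might look like. Every name and parameter value here (gabor_kernel, gabor_bank, filter_energy, feature_vector, the 15-pixel kernel size, the two wavelengths, four orientations, sigma = 3) is an assumption chosen for illustration and is not taken from the paper, which states only that its parameters were tuned heuristically.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Even-symmetric Gabor kernel: Gaussian envelope times a cosine carrier.

    All parameter values used with this function are illustrative assumptions.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates so the carrier oscillates along direction theta.
    x_r = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_r / wavelength)
    kernel = envelope * carrier
    # Subtract the mean so that uniform regions produce no response.
    return kernel - kernel.mean()

def gabor_bank(size=15, wavelengths=(4.0, 8.0), n_orientations=4, sigma=3.0):
    """One kernel per (wavelength, orientation) pair; wavelengths vary slowest."""
    thetas = [k * np.pi / n_orientations for k in range(n_orientations)]
    return [gabor_kernel(size, w, t, sigma) for w in wavelengths for t in thetas]

def filter_energy(image, kernel):
    """Mean squared filter response, using FFT-based circular convolution."""
    padded = np.zeros_like(image, dtype=float)
    k = kernel.shape[0]
    padded[:k, :k] = kernel
    response = np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))
    return float(np.mean(response ** 2))

def feature_vector(image):
    """Energy in each channel of the bank; input must be at least 15x15."""
    return np.array([filter_energy(image, k) for k in gabor_bank()])
```

In a sketch like this, per-channel energies computed over image blocks could serve as features for separating text from non-text regions or for telling scripts apart; how the paper itself forms and classifies its features is not stated in this record.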