Abstract
Detecting text in images poses a distinctive challenge: both in-scene and superimposed text must be found across a wide range of sizes, fonts, colors, and textures, often against complex backgrounds. The goal of this system is not to recognize specific letters or words, but only to determine whether each pixel belongs to text. This pixel-level decision is made by applying a set of weighted classifiers built from a set of high-pass filters, together with a series of image processing techniques. We assert that the learned weighted combination of frequency filters, in conjunction with image processing techniques, may achieve better pixel-level text detection performance in terms of precision, recall, and f-metric than any of the components does individually. Qualitatively, the algorithm performs well and shows promising results. Quantitatively, the results are lower than desired but not unreasonable: for the complete ensemble, the f-metric was 0.36.
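As an illustration only, the pixel-level weighted vote and the evaluation measures named in the abstract might be sketched as follows. This is a minimal sketch under stated assumptions, not the thesis implementation: the abstract does not give the learned weights, the decision threshold, or the exact definition of the f-metric, so the sketch assumes a "more than half of the total weight" rule and the standard harmonic mean of precision and recall. The names `weighted_vote` and `precision_recall_f` are hypothetical.

```python
from typing import List, Sequence, Tuple

Mask = List[List[int]]  # binary mask: 1 = text pixel, 0 = non-text pixel


def weighted_vote(masks: Sequence[Mask], weights: Sequence[float]) -> Mask:
    """Combine per-classifier binary masks into one mask.

    Assumption: a pixel is labeled text when the summed weight of the
    classifiers voting 'text' exceeds half of the total weight.
    """
    if not masks or len(masks) != len(weights):
        raise ValueError("need at least one mask and one weight per mask")
    half = sum(weights) / 2.0
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [
            1 if sum(w * m[r][c] for m, w in zip(masks, weights)) > half else 0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def precision_recall_f(pred: Mask, truth: Mask) -> Tuple[float, float, float]:
    """Pixel-level precision, recall, and their harmonic mean.

    Assumption: the f-metric is the standard F1 measure.
    """
    pairs = [(p, t) for pr, tr in zip(pred, truth) for p, t in zip(pr, tr)]
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)  # text found
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)  # false alarms
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)  # text missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f = 2 * precision * recall / total if total else 0.0
    return precision, recall, f


# Toy 2x2 example with three hypothetical classifier outputs.
masks = [[[1, 0], [1, 1]], [[1, 1], [0, 0]], [[0, 0], [1, 0]]]
combined = weighted_vote(masks, [2, 1, 1])
scores = precision_recall_f(combined, [[1, 0], [0, 1]])
```

In this toy case the first classifier carries half of the total weight, so a pixel it marks as text is kept only when at least one other classifier agrees.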
Library of Congress Subject Headings
Video recordings--Data processing; Optical pattern recognition; Image analysis; Image processing--Digital techniques; CAPTCHA (Challenge-response test)--Data processing
Publication Date
5-1-2011
Document Type
Thesis
Department, Program, or Center
Chester F. Carlson Center for Imaging Science (COS)
Advisor
Zanibbi, Richard
Recommended Citation
Snyder, Dave, "Text detection in natural scenes through weighted majority voting of DCT high pass filters, line removal, and color consistency filtering" (2011). Thesis. Rochester Institute of Technology. Accessed from https://repository.rit.edu/theses/3032
Campus
RIT – Main Campus
Comments
Note: imported to RIT Scholar Works from RIT’s Digital Media Library running on DSpace. Physical copy available through The Wallace Library at RIT, call number TA1650 .S69 2011.