Abstract
In this thesis, we propose a new technique for initializing Convolutional K-means. We introduce Visual Similarity Sampling (VSS), which collects $8\times8$ sample patches from images for convolutional feature learning. The algorithm uses within-class and between-class cosine similarity/dissimilarity measures to collect samples from both foreground and background. VSS thus uses the local frequency of shapes within a character patch as a probability distribution for selecting patches. We show that initializing Convolutional K-means from samples with high between-class and within-class similarity produces a discriminative codebook, which we evaluate on text detection in natural scenes. Using each sample's representativeness within and between classes as its probability of being selected as an initial cluster center yields discriminative cluster centers, which we use as feature maps. Because the approach is not problem dependent, one of its advantages is that it can be applied to sample collection in other pattern recognition problems. The proposed algorithm improved detection rates and simplified the learning process in both convolutional feature learning and text detection training.
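The following is a minimal sketch of the similarity-weighted sampling idea described above: each patch's within-class and between-class cosine similarity is combined into a selection probability used to draw initial cluster centers for Convolutional K-means. It assumes $8\times8$ patches flattened to 64-dimensional vectors with a binary foreground/background label; all function and variable names are illustrative, not taken from the thesis.

import numpy as np

def cosine_similarity_matrix(X):
    """Pairwise cosine similarity between row vectors of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True) + 1e-12
    Xn = X / norms
    return Xn @ Xn.T

def similarity_weighted_sample(patches, labels, k, seed=None):
    """Pick k patches as initial cluster centers, weighting each patch
    by its mean similarity to its own class (within-class) and to the
    other class (between-class)."""
    rng = np.random.default_rng(seed)
    S = cosine_similarity_matrix(patches)
    np.fill_diagonal(S, 0.0)  # exclude self-similarity
    same = labels[:, None] == labels[None, :]
    within = (S * same).sum(1) / np.maximum(same.sum(1) - 1, 1)
    between = (S * ~same).sum(1) / np.maximum((~same).sum(1), 1)
    # Combine the two scores and normalize into a probability distribution.
    score = within + between
    p = score - score.min()
    p = p / p.sum()
    idx = rng.choice(len(patches), size=k, replace=False, p=p)
    return patches[idx]

# Example: 500 random 8x8 patches with binary labels, 64 initial centers.
patches = np.random.rand(500, 64)
labels = np.random.randint(0, 2, size=500)
centers = similarity_weighted_sample(patches, labels, k=64, seed=0)

The combined score is only one plausible way to weight the two similarity terms; the thesis itself may weight or threshold them differently before running K-means.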
Library of Congress Subject Headings
Optical character recognition; Machine learning; Pattern recognition systems
Publication Date
8-2017
Document Type
Thesis
Student Type
Graduate
Degree Name
Computer Science (MS)
Department, Program, or Center
Computer Science (GCCIS)
Advisor
Richard Zanibbi
Advisor/Committee Member
Leo Reznik
Advisor/Committee Member
Matthew Fluet
Recommended Citation
Aziz, Kardo Othman, "Better Text Detection through Improved K-means-based Feature Learning" (2017). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/9512
Campus
RIT – Main Campus
Plan Codes
COMPSCI-MS
Comments
Physical copy available from RIT's Wallace Library at TA1640 .A94 2017