Abstract
Support Vector Machines (SVMs) have demonstrated accuracy and efficiency in a variety of binary classification applications, including indoor/outdoor scene categorization of consumer photographs and the separation of unsolicited commercial electronic mail from legitimate personal communications. This thesis examines a parallel implementation of the Sequential Minimal Optimization (SMO) method of training SVMs, which yields multiprocessor speedup at the cost of a decrease in accuracy that depends on the data distribution and the number of processors. The SVM classification system was subsequently applied to the image labeling and e-mail classification problems. A parallel implementation of the image classification system's color histogram, color coherence, and edge histogram feature extractors increased performance with both noncaching and caching data distribution methods. The electronic mail classification application achieved an accuracy of 96.69% with a user-generated dictionary. An implementation of the electronic mail classifier as a Microsoft Outlook add-in provides immediate mail filtering capabilities to the average desktop user. While the parallel implementation of the SVM trainer was not supported for the classification applications, the parallel feature extractor improved image classification performance.
Library of Congress Subject Headings
Machine learning; Computer algorithms; Images, Photographic--Classification; Electronic mail messages--Classification
Publication Date
6-1-2002
Document Type
Thesis
Department, Program, or Center
Computer Engineering (KGCOE)
Advisor
Shaaban, Muhammad
Advisor/Committee Member
Savakis, Andreas
Advisor/Committee Member
Czernikowski, Roy
Recommended Citation
Woitaszek, Matthew, "Support vector machines for image and electronic mail classification" (2002). Thesis. Rochester Institute of Technology. Accessed from https://repository.rit.edu/theses/5466
Campus
RIT – Main Campus
Comments
Note: imported into RIT Scholar Works from RIT's Digital Media Library, which ran on DSpace. Physical copy available through RIT's Wallace Library at: QA76.9.A43 W648 2002