Abstract
Activity recognition is an active field of research with applications in both industrial and home settings. Industry might use it as part of a security surveillance system, while home uses include smart rooms and aids for the disabled. This thesis develops one component of a “smart system” that can recognize certain activities related to the subject’s intent, i.e., where subjects concentrate their attention. A visual intent activity recognition system that operates in near real time is created, based on multiple cameras. A combination of face detection, facial feature detection, and pose estimation is used to estimate each subject’s gaze direction. Multiple cameras allow better detection of the subject’s facial features, and thus more robust pose estimation: a wide-view camera is zoomed out and finds the subject, while a narrow-view camera zooms in to capture more detail on the face. Neural networks are then used to locate the mouth and eyes. A triangle template is matched to these features and used to estimate the subject’s pose in real time. This method is used to determine where the subjects are looking and to detect the activity of looking intently at a given location. A four-camera system recognizes the activity as occurring when at least one of two subjects is looking at the other. Testing showed that, on average, the pose estimate was accurate to within 5.08 degrees. The visual intent activity recognition system correctly determined when one subject was looking at the other over 95% of the time.
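As an illustration only, the final decision step described in the abstract (the activity is occurring when at least one of two subjects is looking at the other) can be sketched in Python. This is a minimal sketch, not the thesis implementation: the 2-D floor-plane coordinates, the function names, and the 10-degree tolerance are all assumptions introduced here, since the abstract does not specify them.

```python
import math


def angle_between_deg(u, v):
    """Return the angle in degrees between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("vectors must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))


def is_looking_at(position, gaze_dir, target, tolerance_deg=10.0):
    """True if gaze_dir points from position toward target within tolerance.

    tolerance_deg is a hypothetical threshold chosen for this sketch.
    """
    to_target = tuple(t - p for t, p in zip(target, position))
    return angle_between_deg(gaze_dir, to_target) <= tolerance_deg


def looking_activity(pos_a, gaze_a, pos_b, gaze_b, tolerance_deg=10.0):
    """True when at least one of the two subjects is looking at the other."""
    return (is_looking_at(pos_a, gaze_a, pos_b, tolerance_deg)
            or is_looking_at(pos_b, gaze_b, pos_a, tolerance_deg))
```

For example, with subject A at (0, 0) gazing along (1, 0) and subject B at (2, 0) gazing along (0, 1), A's gaze points directly at B while B looks away, so `looking_activity` reports the activity as occurring. In practice the tolerance would need to be chosen with the pose estimation error in mind.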
Library of Congress Subject Headings
Gaze--Data processing; Eye--Movements--Data processing; Computer vision; Pattern recognition systems; Human face recognition (Computer science); Neural networks (Computer science)
Publication Date
12-1-2006
Document Type
Thesis
Department, Program, or Center
Computer Engineering (KGCOE)
Advisor
Savakis, Andreas - Chair
Advisor/Committee Member
Cockburn, Juan
Advisor/Committee Member
Hu, Fei
Recommended Citation
Erhard, Matthew, "Visual intent recognition in a multiple camera environment" (2006). Thesis. Rochester Institute of Technology. Accessed from https://repository.rit.edu/theses/5502
Campus
RIT – Main Campus
Comments
Note: Imported from RIT’s Digital Media Library (running on DSpace) to RIT Scholar Works. A physical copy is available from RIT’s Wallace Library; call number QP491 .E74 2006.