Human activity recognition is an emerging area of research in computer vision with applications in video surveillance, human-computer interaction, robotics, and video annotation. Despite a number of recent advances, there are still many opportunities for new developments, especially in the area of person-person and person-object interaction. Many proposed algorithms recognize only one category of activity: single-person, person-person, or person-object. An algorithm that can recognize all three types would be a significant step toward the real-world application of this technology. This thesis investigates the design and implementation of such an algorithm. It uses background subtraction to extract the subjects in the scene and pixel clustering to segment their images into body parts. A location-based feature identification algorithm extracts feature points from these segments and feeds them to a classifier, which assigns each video an activity label. Together these techniques comprise an algorithm that can recognize single-person activities as well as person-person and person-object interactions. The algorithm's performance was evaluated on interactions in a new video dataset, demonstrating the effectiveness of limb-level feature points as a means of identifying human interactions.
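
The first stages named in the abstract (background subtraction, pixel clustering, and location-based feature points) can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes a static grayscale background image, a fixed difference threshold, plain k-means over pixel coordinates with a farthest-first initialization, and cluster centroids standing in for feature points. The function names, the threshold of 30, and the number of clusters are all hypothetical, and the classification stage is omitted.

```python
import numpy as np


def foreground_mask(frame, background, threshold=30):
    """Background subtraction: flag pixels that differ from a static background.

    The threshold value is an illustrative assumption.
    """
    diff = np.abs(frame.astype(np.int32) - background.astype(np.int32))
    return diff > threshold


def cluster_pixels(mask, k, iterations=20):
    """Group foreground pixels into k segments with plain k-means on (row, col).

    Clustering on coordinates alone is a simplification for illustration.
    """
    points = np.argwhere(mask).astype(np.float64)
    if len(points) < k:
        raise ValueError("fewer foreground pixels than clusters")

    # Farthest-first initialization keeps the sketch deterministic.
    centers = [points[0]]
    for _ in range(1, k):
        nearest = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[nearest.argmax()])
    centers = np.array(centers)

    for _ in range(iterations):
        # Assign each pixel to its nearest center, then recompute the centers.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return points, labels, centers


def segment_feature_points(frame, background, k):
    """Return one (row, col) feature point per segment: the segment centroid."""
    mask = foreground_mask(frame, background)
    _, _, centers = cluster_pixels(mask, k)
    return centers
```

In a fuller pipeline, the feature points from each frame would be collected over time and passed to a classifier. That stage is left out here because the abstract does not specify which classifier or feature encoding is used.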

Library of Congress Subject Headings

Computer vision; Motion perception (Vision)--Computer simulation; Human locomotion--Analysis; Kinesiology--Data processing; Classification--Data processing

Publication Date


Document Type


Department, Program, or Center

Computer Engineering (KGCOE)


Savakis, Andreas

Advisor/Committee Member

Cockburn, Juan

Advisor/Committee Member

Shaaban, Muhammad


Note: Imported from RIT's Digital Media Library, running on DSpace, to RIT Scholar Works. A physical copy is available from RIT's Wallace Library at: TA1650 .D84 2009


RIT – Main Campus