A multicamera system for gesture tracking with three dimensional hand pose estimation

Evan Clark

Note: Imported from RIT’s Digital Media Library, running on DSpace, to RIT Scholar Works. A physical copy is available through The Wallace Library at RIT: TA1634 .C52 2006


The goal of any visual tracking system is to detect and then follow an object of interest through a sequence of images. The difficulty of tracking an object depends on its dynamics, motion, and characteristics, as well as on the environment. For example, tracking an articulated, self-occluding object such as a signing hand has proven to be a very difficult problem. The focus of this work is on tracking and pose estimation with applications to hand gesture interpretation. An approach is presented that attempts to integrate the simplicity of a region tracker with single-hand 3D pose estimation methods. Additionally, this work delves into the pose estimation problem by analyzing hand templates composed of their morphological skeletons and by addressing the skeleton’s inherent instability. Ligature points along the skeleton are flagged in order to determine their effect on skeletal instabilities. Tested on real data, the analysis finds that flagging ligature points proportionally increases the match strength of high-similarity image-template pairs by about 6%. The effectiveness of this approach is further demonstrated in a real-time multicamera hand tracking system that tracks hand gestures through three-dimensional space and estimates the three-dimensional pose of the hand.