Abstract

Identifying unfamiliar human movements is crucial across several tasks and fields, including skill acquisition and human movement analysis research. With advances in computer vision technology, search systems can be developed that allow users to search for human movements by submitting a video sample. A clear example of such a system is a search-by-video sign-language look-up system. Advances in sign-language recognition technology have made it feasible to design these systems, which allow users to generate queries either by submitting a video of themselves performing the word or by submitting a video segment containing the unknown sign. These systems can be particularly useful for sign-language look-up because no standard labels can be used to search for an unfamiliar sign. They also have significant advantages over current commercial systems, which require users to select various linguistic properties of a sign to search for its meaning, a task that can be challenging for learners. However, even with improvements in sign recognition, users of a near-future sign look-up system are unlikely to see only one correct result. In practice, it is much more likely that users will need to browse a list of possible "matches" to find the word they seek. As artificial intelligence researchers work on sign-recognition technologies, Human-Computer Interaction (HCI) research is needed to investigate how best to structure the user experience of sign look-up systems for learners, given the imperfect nature of sign recognition.

This dissertation presents original empirical investigations into the design of search-by-video American Sign Language (ASL) look-up systems. It presents and evaluates prototype systems and illustrates how HCI research can enhance the user experience of these technologies. The dissertation also explores how some of our findings could inform the design of other valuable structured human-movement recognition systems. Its contributions are split into four parts:

Part 1 comprises three experimental studies investigating the impact of three sign-recognition performance variables on the usability of sign-language dictionaries. The results from these studies inform metrics that can predict users' satisfaction with an ASL dictionary's sign-recognition technology.

Part 2 presents an interview study and an experimental study that investigate how best to structure the experience of search-by-video sign-language dictionary systems. Results from these studies revealed that simply restructuring the user experience can enhance user satisfaction with search-by-video systems without improving the performance of the underlying sign-recognition system.

Part 3 includes an interview study, an observational study, and an experimental study, which examine the use of sign-recognition technology to look up unfamiliar signs while watching challenging ASL videos. The results demonstrate how ASL learners use a video sub-span as input to a dictionary search and the advantages of doing so. These findings also motivate the investigation of sub-span-based search in other contexts.

Part 4 involves an interview study and an observational study, which demonstrate how the span-based search approach can be used in other, non-linguistic structured human-movement search contexts. Specifically, the studies explore the use of a span-search tool to look up unfamiliar dance moves while engaged in a script-writing task, demonstrating the versatility of the approach.

Library of Congress Subject Headings

Gesture recognition (Computer science); Search engines; Video recordings--Computer network resources; Human-computer interaction

Publication Date

5-2023

Document Type

Dissertation

Student Type

Graduate

Degree Name

Computing and Information Sciences (Ph.D.)

Department, Program, or Center

Computer Science (GCCIS)

Advisor

Matt Huenerfauth

Advisor/Committee Member

Kristen Shinohara

Advisor/Committee Member

Garreth Tigwell

Campus

RIT – Main Campus

Plan Codes

COMPIS-PHD
