Deaf and hard-of-hearing (DHH) viewers rely on captioning to perceive auditory information while watching live television programming. Captioning ensures that viewers can receive textual and non-textual information that is crucial to comprehending the content of the video. However, no research has investigated how non-textual properties of captioning, such as caption placement and the presentation of current speakers, influence DHH viewers’ ability to comprehend video content and their judgement of a captioned video’s quality. Thus, this work aims to understand the effect of non-textual properties of captioning on DHH viewers’ experiences of watching live captioned video; these findings can inform improvements to existing live captioning guidelines and caption quality evaluation methods to benefit DHH viewers who use captioning while watching live TV programming. This dissertation offers the following three major contributions:

1. Understanding the Perspectives of DHH Users while Watching Live Captioned TV Programming. While captioning is a long-standing medium for disseminating auditory information, live captioning is a relatively newer phenomenon. It is still unknown how current practices related to captioning properties and appearance affect DHH viewers’ ability to comprehend the content of TV programming. Furthermore, it is also unknown what challenges DHH viewers encounter with traditional captioning and what methods of presenting captions they prefer in a live context. Thus, the first part of this dissertation focuses on investigating the perspectives of DHH viewers regarding caption appearance in live TV programming.

2. The Creation and Evaluation of Occlusion Severity-Based Caption Evaluation Methods. Prior literature has identified challenges encountered by DHH viewers when captions overlap some of the visual information on the screen. However, no research has quantified how much information captions occlude and which occluded information affects DHH viewers’ subjective assessments of the quality of captioned videos. In light of this, this part presents research that aims to identify the information regions that are crucial to DHH viewers. Based on the subjective priority users assign to these information regions, we developed and evaluated metrics that can estimate, from the perspective of DHH viewers, the quality of captioning when captions occlude various information regions.

3. The Effect of Speaker Identifier Methods on DHH Viewers’ Subjective Assessments of a Captioned Video’s Quality. This part explores another facet of caption attributes, namely the ability to identify the current speaker on the screen. Prior literature has proposed several advanced speaker identification methods that somewhat improved viewers’ ability to comprehend who is speaking. However, due to the challenges involved in live captioning, including captioners’ time and resource limitations, several in-text speaker identification methods are more commonly used in live contexts, yet no prior research has evaluated how these methods perform when multiple people appear on the screen. Therefore, we investigate whether DHH viewers’ preferences regarding speaker identification methods change with the number of onscreen speakers.

Library of Congress Subject Headings

Real-time closed captioning--Public opinion; Real-time closed captioning--Quality control; Deaf--Attitudes; Hearing impaired--Attitudes; Eye tracking; User interfaces (Computer systems)--Design

Publication Date


Document Type


Student Type


Degree Name

Computing and Information Sciences (Ph.D.)

Department, Program, or Center

Computer Science (GCCIS)


Matt Huenerfauth

Advisor/Committee Member

Raja Kushalnagar

Advisor/Committee Member

Kristen Shinohara


RIT – Main Campus

Plan Codes