Abstract

Captions have traditionally served as a bridge between the spoken word and its written representation, helping make speech accessible to Deaf and Hard-of-Hearing (DHH) individuals. It is worth considering, however, how much of speech is left out by this ‘bridging’ between sound and visuals. This dissertation describes a research project that has, over six studies, examined this very issue. We first asked whether there is a problem here at all: what does the experience of DHH individuals with captioning systems tell us about these systems’ shortcomings? For one, we found that captions are experienced as monotonous and ambiguous. Communication is multimodal, and DHH individuals also use non-speech cues such as facial expressions or body language to disambiguate a speaker’s intended meaning, but these channels are imperfect. At their worst, conventional captioning systems are alienating: much is lost in these audiovisual translations of speech, and what is lost matters.

Study 2 investigated interventions to captioning systems that could close the gap between spoken words and text. Through various prototypes, it aimed to understand which dimensions of speech would make captions most helpful: prosody, emotion, or a combination of both. Emotions, we found, were the best compromise between utility and legibility. Study 3 explored the design space of these ‘affective’ captions: which typographic parameters can best be modulated to depict an emotion’s valence and arousal levels? The investigation considered both subjective preferences and objective measures. We found that valence should be depicted through color. For arousal, either font-size or font-weight should be used, with the former preferred for videos with looser legibility requirements.

Studies 4–6 constitute the final phase of this work, examining how haptics can be shaped to convey a speaker’s arousal and what the consequences of doing so are. Study 4 experimentally selected a wrist-worn vibrotactile mapping for arousal, identifying a single short pulse at 75 Hz, with amplitude scaled to arousal, as the best compromise between comfort and discriminability. Study 5 then compared five captioning conditions on longer clips and found that a combined approach, conveying valence via color and arousal via both font-weight and haptics, significantly increased Narrative Engagement for DHH viewers over both a neutral baseline and a visuals-only affective style. Finally, Study 6 measured arousal-decoding accuracy on short clips and showed that adding haptic cues reliably reduced absolute error in perceived arousal, whereas font-weight alone did not yield a main effect. Together, these studies indicate that multimodal affective captions can be both more engaging and more informative than conventional captions.

Taken as a whole, this dissertation demonstrates an approach for taking captions beyond verbatim transcription by incorporating affective dimensions of speech. Across six studies, we showed that affective captions are not only technically feasible but also valued by DHH viewers: they can increase engagement, clarify emotional nuance, and support the decoding of subtle aspects of speech. By combining visual typography with haptic signals, we offer both conceptual and methodological advances toward more expressive and inclusive captioning systems. Beyond the specific designs and findings, the broader contribution is to reframe captioning as a fertile, multimodal design space capable of accommodating diverse communication needs.

Library of Congress Subject Headings

Deaf people--Means of communication; Hard of hearing people--Means of communication; Subtitles (Motion pictures, television, etc.); Closed captioning; Natural language processing (Computer science)

Publication Date

10-2025

Document Type

Dissertation

Student Type

Graduate

Degree Name

Computing and Information Sciences (Ph.D.)

Department, Program, or Center

Computing and Information Sciences Ph.D., Department of

College

Golisano College of Computing and Information Sciences

Advisor

Matt Huenerfauth

Advisor/Committee Member

Roshan L Peiris

Advisor/Committee Member

Kristen Shinohara

Campus

RIT – Main Campus

Plan Codes

COMPIS-PHD
