Abstract
Live speech transcription and captioning are important for the accessibility of deaf and hard-of-hearing individuals, especially in situations where no ASL interpreter is present. When live captioning is available at all, it is typically rendered in the style of closed captions on a display such as a phone screen or TV, away from the actual conversation. This can divide the viewer's focus and detract from the experience. This paper proposes an investigation into an alternative, augmented-reality-driven approach to displaying these captions, using deep neural networks to compute, track, and associate deep visual and speech descriptors in order to anchor captions as "speech bubbles" above the speaker.
Library of Congress Subject Headings
Real-time closed captioning--Technological innovations; Augmented reality; Neural networks (Computer science)
Publication Date
5-2020
Document Type
Thesis
Student Type
Graduate
Degree Name
Computer Science (MS)
Department, Program, or Center
Computer Science (GCCIS)
Advisor
Joe Geigel
Advisor/Committee Member
Zack Butler
Advisor/Committee Member
Thomas Kinsman
Recommended Citation
Bowald, Dylan, "AR Comic Chat" (2020). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/10372
Campus
RIT – Main Campus
Plan Codes
COMPSCI-MS