Animations of sign language can increase the accessibility of information for people who are deaf or hard of hearing (DHH), but prior work has demonstrated that accurate non-manual expressions (NMEs), consisting of face and head movements, are necessary to produce linguistically accurate animations that are easy to understand. When synthesizing an animation, given a sequence of signs performed on the hands (and their timing), we must select an NME performance. Given a corpus of facial motion-capture recordings of American Sign Language (ASL) sentences, annotated with the timing of the signs in each recording, we investigate methods (based on word count and on delexicalized sign timing) for selecting the best NME recording to use as a basis for synthesizing a novel animation. By comparing the recordings selected using these methods to a gold-standard recording, we identify the top-performing exemplar selection method for several NME categories.
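The two families of selection methods named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Recording` type, the function names, and the distance measure (sum of absolute differences between duration-normalized sign boundaries) are all assumptions made for illustration, since the abstract does not specify how the methods are computed.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (start, end) of one sign, in seconds; sign identity is ignored ("delexicalized").
SignTimes = List[Tuple[float, float]]


@dataclass
class Recording:
    """Hypothetical corpus entry: a name plus the annotated sign timings."""
    name: str
    sign_times: SignTimes


def select_by_word_count(target: SignTimes, corpus: List[Recording]) -> Recording:
    """Pick the recording whose number of signs is closest to the target's."""
    n = len(target)
    return min(corpus, key=lambda r: abs(len(r.sign_times) - n))


def normalized_boundaries(sign_times: SignTimes) -> SignTimes:
    """Rescale sign boundaries to [0, 1]; assumes a positive total duration."""
    start = sign_times[0][0]
    total = sign_times[-1][1] - start
    return [((s - start) / total, (e - start) / total) for s, e in sign_times]


def timing_distance(a: SignTimes, b: SignTimes) -> float:
    """Assumed distance: summed boundary differences; infinite if counts differ."""
    if len(a) != len(b):
        return float("inf")
    na, nb = normalized_boundaries(a), normalized_boundaries(b)
    return sum(abs(s1 - s2) + abs(e1 - e2) for (s1, e1), (s2, e2) in zip(na, nb))


def select_by_timing(target: SignTimes, corpus: List[Recording]) -> Recording:
    """Pick the recording whose normalized sign timing best matches the target."""
    return min(corpus, key=lambda r: timing_distance(target, r.sign_times))


corpus = [
    Recording("A", [(0.0, 1.0), (1.0, 2.0)]),
    Recording("B", [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]),
    Recording("C", [(0.0, 1.0), (1.0, 2.0), (2.0, 4.0)]),
]
target = [(0.0, 0.5), (0.5, 1.0), (1.0, 2.0)]

by_count = select_by_word_count(target, corpus)
by_timing = select_by_timing(target, corpus)
```

In this toy corpus, recordings B and C both match the target's sign count, so a count-based method cannot tell them apart, while the timing-based method prefers C because its relative sign durations match the target's.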

Date of creation, presentation, or exhibit



Copyright 2016 INTERSPEECH. Presented at the 7th Workshop on Speech and Language Processing for Assistive Technologies, INTERSPEECH 2016, September 8-12, 2016, San Francisco, California.

Document Type

Conference Paper

Department, Program, or Center

Information Sciences and Technologies (GCCIS)


RIT – Main Campus