Abstract

The use of deep learning algorithms has led to significant progress in automatic speech recognition (ASR). ASR models may require over a thousand hours of speech data to recognize speech accurately. Case studies have indicated that factors such as noise, acoustically distorting conditions, and voice quality affect speech recognition performance. In this research, we investigate the impact of noise on ASR and explore novel methods for developing noise-robust ASR models using a Tamil language dataset with limited resources. We use the speech dataset provided by SpeechOcean.com and Microsoft for the Indian languages. We add several kinds of noise to the dataset and measure how each affects ASR performance. We also determine whether certain data augmentation methods, namely raw data augmentation and spectrogram augmentation (SpecAugment), are better suited to different types of noise. Our results show that all noises, regardless of type, had an impact on ASR performance, and that upgrading the architecture alone was unable to mitigate the impact of noise. Raw data augmentation enhances ASR performance on both clean and noise-mixed data; this was not the case with SpecAugment on the same test sets. As a result, raw data augmentation performs substantially better than SpecAugment relative to the baseline models.
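The two augmentation families named in the abstract can be illustrated with a minimal sketch. The abstract does not state how noise was mixed, at which signal-to-noise ratios, or with what mask sizes, so everything below is an assumption made for illustration: the function names (`mix_at_snr`, `measured_snr_db`, `mask_spectrogram`), the SNR-based scaling, and the masking parameters are hypothetical and are not the thesis's implementation. The masking function covers only the frequency and time masking parts of SpecAugment, not time warping.

```python
import math
import random


def mean_power(signal):
    """Average power of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Raw-audio augmentation: add noise scaled to a target SNR in dB.

    Assumes both inputs are non-empty lists of floats and that the
    noise is not all zeros. The noise clip is repeated or trimmed to
    the length of the speech clip before scaling.
    """
    reps = math.ceil(len(speech) / len(noise))
    noise = (noise * reps)[:len(speech)]
    # Choose the noise power so that speech_power / noise_power = 10^(snr/10).
    target_noise_power = mean_power(speech) / (10 ** (snr_db / 10))
    scale = math.sqrt(target_noise_power / mean_power(noise))
    return [s + scale * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, mixture):
    """SNR of a mixture, given the clean speech it was built from."""
    residual = [m - s for s, m in zip(speech, mixture)]
    return 10 * math.log10(mean_power(speech) / mean_power(residual))


def mask_spectrogram(spec, max_freq_width, max_time_width, rng):
    """Spectrogram augmentation: zero one frequency band and one time band.

    `spec` is a list of frequency-bin rows, each a list of time frames.
    Widths are drawn uniformly from 0..max (inclusive); the maxima must
    not exceed the spectrogram dimensions. The input is not modified.
    """
    out = [row[:] for row in spec]
    n_freq, n_time = len(out), len(out[0])
    f = rng.randint(0, max_freq_width)
    f0 = rng.randint(0, n_freq - f)
    for i in range(f0, f0 + f):
        out[i] = [0.0] * n_time
    t = rng.randint(0, max_time_width)
    t0 = rng.randint(0, n_time - t)
    for row in out:
        row[t0:t0 + t] = [0.0] * t
    return out
```

The difference in where each method operates is the point of the sketch: `mix_at_snr` changes the waveform before feature extraction, so the model sees realistic corrupted audio, while `mask_spectrogram` hides parts of the features without adding any noise-like content.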

Library of Congress Subject Headings

Automatic speech recognition; Automatic speech recognition--Technological innovations; Noise

Publication Date

4-2022

Document Type

Thesis

Student Type

Graduate

Degree Name

Computer Engineering (MS)

Department, Program, or Center

Computer Engineering (KGCOE)

Advisor

Emily Prud'hommeaux

Advisor/Committee Member

Alexander Loui

Advisor/Committee Member

Andres Kwasinski

Recommended Citation

Lakshminarayanan, Vigneshwar, "Impact of Noise in Automatic Speech Recognition for Low-Resourced Languages" (2022). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/11113

Campus

RIT – Main Campus

Plan Codes

CMPE-MS
