Abstract

Deep Neural Networks (DNNs) have shown remarkable success in speech denoising; however, their high computational and energy requirements make real-time deployment on edge devices challenging. In contrast, Spiking Neural Networks (SNNs) operate on sparse, event-driven spikes, offering a biologically inspired and energy-efficient alternative. In this study, we explore SNN models that leverage the temporal dynamics of spiking neurons to capture long-range dependencies in audio signals. By encoding the input audio into sparse, event-driven spike trains, an SNN can process temporal information while requiring significantly fewer computations than a DNN. We present a real-time speech denoising system that maps noisy audio to sparse spike trains and processes them with diverse SNN architectures designed to exploit both time- and frequency-domain features. We investigate several baseline models and propose the Dual-Signal Transformation Spiking Network, a hybrid model that performs frequency-domain enhancement through spectrogram masking and complements it with raw waveform-based reconstruction in the time domain. Our experiments show that SNNs, especially the Dual-Signal model, achieve competitive denoising performance at substantially lower computational cost, opening up possibilities for efficient, real-time auditory processing on neuromorphic hardware and contributing to the development of practical SNN-based audio denoisers.
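To make the two ideas in the abstract concrete, the sketch below illustrates (a) how audio can be turned into a sparse, event-driven spike train via delta modulation, and (b) how a frequency-domain path can denoise by masking the noisy spectrogram. This is a minimal illustration, not the thesis's actual implementation: the threshold value, window size, and the `mask_fn` placeholder (standing in for the spiking network's predicted time-frequency mask) are assumptions chosen for clarity.

```python
import numpy as np
from scipy.signal import stft, istft

def delta_spike_encode(audio, threshold=0.05):
    """Delta-modulation encoding (illustrative): emit a +1/-1 spike
    whenever the signal moves more than `threshold` away from the
    level at the last spike; emit nothing otherwise. The result is
    a sparse, event-driven representation of the waveform."""
    spikes = np.zeros(len(audio), dtype=np.int8)
    ref = audio[0]
    for t in range(1, len(audio)):
        diff = audio[t] - ref
        if diff > threshold:
            spikes[t] = 1
            ref = audio[t]
        elif diff < -threshold:
            spikes[t] = -1
            ref = audio[t]
    return spikes

def spectral_mask_denoise(noisy, fs, mask_fn, nperseg=512):
    """Frequency-domain path (illustrative): apply a [0, 1] mask to
    the noisy STFT and invert. Here `mask_fn` is a hypothetical
    stand-in for the spiking network that predicts one mask value
    per time-frequency bin from the noisy magnitude spectrogram."""
    f, t, Z = stft(noisy, fs=fs, nperseg=nperseg)
    mask = mask_fn(np.abs(Z))          # e.g., SNN-predicted mask
    _, enhanced = istft(Z * mask, fs=fs, nperseg=nperseg)
    return enhanced
```

In the dual-signal design described above, a spectrogram-masking path of this kind would be paired with a second path that refines the raw waveform directly in the time domain, so that errors introduced by masking alone can be corrected on the reconstructed signal.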

Publication Date

1-22-2026

Document Type

Thesis

Student Type

Graduate

Degree Name

Computer Science (MS)

Department, Program, or Center

Computer Science, Department of

College

Golisano College of Computing and Information Sciences

Advisor

Alexander Ororbia

Advisor/Committee Member

Aaron Deever

Advisor/Committee Member

Eduardo Coelho De Lima

Campus

RIT – Main Campus
