Abstract
The demand to deploy deep learning models on edge devices has recently increased due to their pervasiveness in applications ranging from healthcare to precision agriculture. However, a major challenge with current deep learning models is their computational complexity. One approach to addressing this limitation is to compress deep learning models by employing low-precision numerical formats. Such low-precision models, however, often suffer from degraded inference or training accuracy. This raises the question: which low-precision numerical format can achieve high training accuracy with minimal resources? This research introduces tapered-precision numerical formats for deep learning inference and training. These formats have an inherent capability to match the distribution of deep learning parameters by expressing values with unequal-magnitude spacing, such that the density of representable values is highest near zero and tapers toward the maximum representable number. First, we develop low-precision arithmetic frameworks that utilize tapered-precision numerical formats to enhance the performance of deep learning inference and training. Second, we develop a software/hardware co-design framework that identifies the right format for inference under user-defined constraints through integer linear programming optimization. Third, we propose novel adaptive low-precision algorithms that match the tapered-precision numerical format configuration to the layerwise dynamic range and distribution of parameters within a deep learning model. Finally, we propose a numerical analysis approach and a signal-to-quantization-noise ratio equation for tapered-precision numerical formats, which provide a metric for selecting the appropriate format configuration. The efficacy of the proposed approaches is demonstrated on various benchmarks. Results show that low-precision deep neural networks using tapered-precision numerical formats achieve a better accuracy and hardware-cost trade-off than other well-known numerical formats, including floating point and fixed point.
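
To make the tapering property concrete, the following minimal Python sketch decodes an 8-bit posit, one common tapered-precision format. The abstract does not name a specific format, so the posit encoding (with es = 2 exponent bits) is used here purely as an illustrative assumption, not as the dissertation's exact method.

    def decode_posit(word: int, n: int = 8, es: int = 2) -> float:
        """Decode an n-bit posit word (with es exponent bits) into a float."""
        mask = (1 << n) - 1
        if word == 0:
            return 0.0
        if word == 1 << (n - 1):
            return float("nan")  # NaR (Not a Real)
        sign = -1.0 if (word >> (n - 1)) & 1 else 1.0
        if sign < 0:
            word = (-word) & mask  # two's complement yields the magnitude's encoding
        bits = format(word, f"0{n}b")[1:]  # drop the sign bit
        # Regime: run of identical leading bits, terminated by the opposite bit.
        lead = bits[0]
        run = len(bits) - len(bits.lstrip(lead))
        regime = (run - 1) if lead == "1" else -run
        rest = bits[run + 1:]  # skip the terminating bit, if present
        exponent = int(rest[:es].ljust(es, "0"), 2) if es else 0
        frac_bits = rest[es:]
        fraction = int(frac_bits, 2) / (1 << len(frac_bits)) if frac_bits else 0.0
        return sign * (1.0 + fraction) * 2.0 ** ((1 << es) * regime + exponent)

    # Tapered spacing: count how many positive codes land in (0, 1].
    values = sorted(v for w in range(256) if (v := decode_posit(w)) == v)  # drop NaR
    in_unit = sum(1 for v in values if 0 < v <= 1)
    print(f"{in_unit} of {sum(1 for v in values if v > 0)} positive posit8 codes lie in (0, 1]")

Under this assumed encoding, 64 of the 127 positive codes fall in (0, 1] while the remaining 63 are spread over (1, 2^24], illustrating how a tapered-precision format concentrates its representable values near zero, as the abstract describes.
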
Library of Congress Subject Headings
Deep learning (Machine learning); Learning models (Stochastic processes)
Publication Date
12-2023
Document Type
Dissertation
Student Type
Graduate
Degree Name
Electrical and Computer Engineering (Ph.D.)
Department, Program, or Center
Electrical and Computer Engineering Technology
College
Kate Gleason College of Engineering
Advisor
Dhireesha Kudithipudi
Advisor/Committee Member
Andres Kwasinski
Advisor/Committee Member
Majid Rabbani
Recommended Citation
Fatemi Langroudi, Seyed Hamed, "Tapered-Precision Numerical Formats for Deep Learning Inference and Training" (2023). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/11621
Campus
RIT – Main Campus
Plan Codes
ECE-PHD