Abstract
The deep neural network is an intriguing prognostic model capable of learning meaningful patterns that generalize to new data. The deep learning paradigm has been widely adopted across many domains, including for natural language processing, genomics, and automatic music transcription. However, deep neural networks rely on a plethora of underlying computational units and data, collectively demanding a wealth of compute and memory resources for practical tasks. This model complexity prohibits the use of larger deep neural networks for resource-critical applications, such as edge computing. In order to reduce model complexity, several research groups are actively studying compression methods, hardware accelerators, and alternative computing paradigms. These orthogonal research explorations often leave a gap in understanding the interplay of the optimization mechanisms and their overall feasibility for a given task.
In this thesis, we address this gap by developing a holistic solution to assess the model complexity reduction theoretically and quantitatively at both high-level and low-level abstractions for training and inference. At the algorithmic level, a novel deep, yet lightweight, recurrent architecture is proposed that extends the conventional echo state network. The architecture employs random dynamics, brain-inspired plasticity mechanisms, tensor decomposition, and hierarchy as the key features to enrich learning. Furthermore, the hyperparameter landscape is optimized via a particle swarm optimization algorithm. To deploy these networks efficiently onto low-end edge devices, both ultra-low and mixed-precision numerical formats are studied within our feedforward deep neural network hardware accelerator. More importantly, the tapered-precision posit format with a novel exact-dot-product algorithm is employed in the low-level digital architectures to study its efficacy in resource utilization.
The dynamics of the architecture are characterized through neuronal partitioning and Lyapunov stability, and we show that superlative networks emerge beyond the "edge of chaos" with an agglomeration of weak learners. We also demonstrate that tensorization improves model performance by preserving correlations present in multi-way structures. Low-precision posits are found to consistently outperform other formats on various image classification tasks and, in conjunction with compression, we achieve magnitudes of speedup and memory savings for both training and inference for the forecasting of chaotic time series and polyphonic music tasks. This culmination of methods greatly improves the feasibility of deploying rich predictive models on edge devices.
Library of Congress Subject Headings
Neural networks (Computer science); Machine learning; Artificial intelligence--Forecasting; Stochastic analysis; Geometric quantization; Tensor fields
Publication Date
6-2019
Document Type
Thesis
Student Type
Graduate
Degree Name
Computer Engineering (MS)
Department, Program, or Center
Computer Engineering (KGCOE)
Advisor
Dhireesha Kudithipudi
Advisor/Committee Member
Cory Merkel
Advisor/Committee Member
Panos P. Markopoulos
Recommended Citation
Carmichael, Zachariah JL, "Towards Lightweight AI: Leveraging Stochasticity, Quantization, and Tensorization for Forecasting" (2019). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/10144
Campus
RIT – Main Campus
Plan Codes
CMPE-MS
Comments
Recipient of the 2019 RIT Graduate Education Master’s Thesis Award