Abstract
Deep neural networks train millions of parameters to achieve state-of-the-art performance on a wide array of applications. However, optimizing these parameters with gradient descent leads to lengthy training times coupled with high computational resource requirements. To alleviate these concerns, the idea of fixed-random weights in deep neural networks is explored. More critically, the goal is to maintain performance comparable to that of fully trained models.
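To make the idea concrete, a minimal sketch is included below, assuming a PyTorch-style setup; the small network and layer sizes are purely illustrative and are not the models evaluated in this work. The convolutional filters are randomly initialized and then frozen, so only the classifier head receives gradient updates.

import torch
import torch.nn as nn

# Illustrative small CNN: random, frozen convolutional filters, trainable classifier head.
class FixedRandomCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Freeze the randomly initialized filters; they are never updated.
        for p in self.features.parameters():
            p.requires_grad = False
        # Only this layer is trained.
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = FixedRandomCNN()
# Pass only the trainable parameters to the optimizer.
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=0.1
)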
Metrics such as floating-point operations per second and memory size are compared and contrasted for fixed-random and fully trained models. Additional analysis of downsized models, built to match the number of trained parameters in the fixed-random models, shows that fixed-random weights enable slightly higher performance. With fixed-random convolutional filters, a ResNet model achieves ∼57% image classification accuracy on CIFAR-10. In contrast, a DenseNet architecture, with only fixed-random filters in its convolutional layers, achieves ∼88% accuracy on the same task. The fully trained DenseNet achieves ∼96% accuracy, which highlights the importance of architectural choice for a high-performing model.
To further understand the role of architectures, random projection networks trained using a least squares approximation learning rule are studied. In these networks, deep random projection layers and skip connections are exploited, as they are shown to boost overall network performance. In several of the image classification experiments conducted, additional layers and skip connectivity consistently outperform a baseline random projection network by 1% to 3%. To further reduce model complexity, a tensor decomposition technique known as the Tensor-Train decomposition is leveraged. Compressing the fully connected hidden layer yields at least a ∼40x reduction in memory size at a slight cost in resource utilization. This study provides a better understanding of how random filters and weights can be utilized to obtain lighter models.
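As a rough sketch of this learning rule, assuming NumPy and a single random projection layer (the actual architectures, data, and regularization in the thesis may differ), the hidden weights are drawn at random and fixed, and only the output weights are solved for in closed form via regularized least squares.

import numpy as np

rng = np.random.default_rng(0)

# Toy data: N flattened images with D features and C one-hot labels (illustrative only).
N, D, H, C = 1000, 3072, 512, 10
X = rng.standard_normal((N, D))
Y = np.eye(C)[rng.integers(0, C, size=N)]

# Fixed random projection layer followed by a nonlinearity.
W_in = rng.standard_normal((D, H)) / np.sqrt(D)
A = np.maximum(X @ W_in, 0.0)  # hidden activations (ReLU)

# Output weights obtained in closed form from a ridge-regularized least squares fit.
lam = 1e-3
W_out = np.linalg.solve(A.T @ A + lam * np.eye(H), A.T @ Y)

# Inference: project, activate, and take the highest class score.
pred = np.argmax(np.maximum(X @ W_in, 0.0) @ W_out, axis=1)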
Library of Congress Subject Headings
Neural networks (Computer science)--Evaluation; Image analysis; Optical pattern recognition
Publication Date
6-2019
Document Type
Thesis
Student Type
Graduate
Degree Name
Computer Engineering (MS)
Department, Program, or Center
Computer Engineering (KGCOE)
Advisor
Dhireesha Kudithipudi
Advisor/Committee Member
Cory Merkel
Advisor/Committee Member
Raymond Ptucha
Recommended Citation
Syed, Humza, "Performance Analysis of Fixed-Random Weights in Artificial Neural Networks" (2019). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/10187
Campus
RIT – Main Campus
Plan Codes
CMPE-MS