Abstract
Deep convolutional neural networks (CNNs) are effective and widely used across computer vision tasks, especially image classification. Conventionally, they consist of a series of convolutional and pooling layers followed by one or more fully connected (FC) layers that produce the final classification output. This design descends from traditional machine learning pipelines that, before the widespread adoption of deep CNNs, paired hand-engineered feature extractors with a classifier. While successful, this design is costly: in models trained on datasets with a large number of categories, the fully connected layers often account for a large fraction of the network's parameters, which is undesirable for memory-constrained applications such as mobile devices and embedded platforms. Recently, a family of architectures that replaces the learned fully connected output layer with a fixed layer has been proposed to improve efficiency. This research examines that idea, extends it further, and demonstrates that fixed classifiers offer no additional benefit over simply removing the output layer along with its parameters. It also shows that the typical fully connected final output layer is inefficient in terms of parameter count. This work shows that the fully connected layers can be removed entirely, reducing model size by up to 75% in some scenarios, at only a small cost in classification accuracy. In most cases, the method achieves performance comparable to a conventionally learned fully connected output layer on the ImageNet-1K, CIFAR-100, Stanford Cars-196, and Oxford Flowers-102 datasets, despite having no fully connected output layer at all. In addition to comparable performance, the method also provides feature visualization of deep CNNs at no additional cost.
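To make the idea concrete, below is a minimal sketch of a classification head with no fully connected output layer. It is written in PyTorch-style Python; the names NoFCClassifier and trunk, and the toy backbone, are illustrative assumptions rather than the thesis's code. The backbone's final convolution is set to emit one channel per class, and parameter-free global average pooling turns those feature maps directly into logits.

    import torch
    import torch.nn as nn

    class NoFCClassifier(nn.Module):
        """Classification head with no learned fully connected output layer.

        Assumes the backbone's final convolution emits one channel per
        class; global average pooling then yields the class logits
        directly, so no FC weight matrix is stored.
        """
        def __init__(self, backbone: nn.Module):
            super().__init__()
            self.backbone = backbone             # trunk whose last conv outputs num_classes channels
            self.pool = nn.AdaptiveAvgPool2d(1)  # global average pooling: no parameters

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            feats = self.backbone(x)             # (N, num_classes, H, W) feature maps
            return self.pool(feats).flatten(1)   # (N, num_classes) logits, no FC layer

    # Illustrative usage with a tiny hypothetical trunk ending in num_classes channels.
    num_classes = 100
    trunk = nn.Sequential(
        nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, num_classes, 1),           # 1x1 conv maps features to class channels
    )
    model = NoFCClassifier(trunk)
    logits = model(torch.randn(8, 3, 32, 32))    # -> shape (8, 100)

In a head like this, each class's pre-pooling channel is a spatial activation map for that class, which is plausibly the mechanism behind the "feature visualization at no additional cost" claimed in the abstract.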
Library of Congress Subject Headings
Computer vision; Neural networks (Computer science); Convolutions (Mathematics); Image analysis; Classification--Data processing; Image processing--Digital techniques
Publication Date
4-27-2020
Document Type
Thesis
Student Type
Graduate
Degree Name
Imaging Science (MS)
Department, Program, or Center
Chester F. Carlson Center for Imaging Science (COS)
Advisor
Christopher Kanan
Advisor/Committee Member
Guoyu Lu
Advisor/Committee Member
Nathan Cahill
Recommended Citation
Qian, Zhongchao, "Deep Convolutional Networks without Learning the Classifier Layer" (2020). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/10367
Campus
RIT – Main Campus
Plan Codes
IMGS-MS