Abstract
In this work, we propose methods that advance four areas of computer vision: dimensionality reduction, deep feature embeddings, visual domain adaptation, and deep neural network compression. We combine concepts from manifold geometry and deep learning to develop cutting-edge methods in each of these areas. Each of the methods proposed in this work achieves state-of-the-art results in our experiments. We propose the Proxy Matrix Optimization (PMO) method for optimization over orthogonal matrix manifolds, such as the Grassmann manifold. This optimization technique is designed to be highly flexible, enabling it to be used in many situations where traditional manifold optimization methods cannot.
We first apply PMO to dimensionality reduction, where we propose an iterative optimization approach to Principal Component Analysis (PCA) in a framework called Proxy Matrix Optimization based PCA (PM-PCA). We also demonstrate how PM-PCA can be used to solve the general $L_p$-PCA problem, a variant of PCA that uses arbitrary fractional norms and can be more robust to outliers. We then present Cascaded Projection (CaP), a method that uses PMO-based tensor compression to reduce the number of filters in deep neural networks. This, in turn, reduces the number of computational operations required to process each image with the network. Cascaded Projection is the first end-to-end trainable method for network compression that uses standard backpropagation to learn the optimal tensor compression. In the area of deep feature embeddings, we introduce Deep Euclidean Feature Representations through Adaptation on the Grassmann manifold (DEFRAG), a method that leverages PMO. DEFRAG improves the feature embeddings learned by deep neural networks through the use of auxiliary loss functions and Grassmann manifold optimization. Lastly, in the area of visual domain adaptation, we propose the Manifold-Aligned Label Transfer for Domain Adaptation (MALT-DA) method, which transfers knowledge from samples in a known domain to an unknown domain based on cross-domain cluster correspondences.
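To make the idea of optimizing over an orthogonal matrix manifold through an unconstrained matrix more concrete, the following is a minimal NumPy sketch, not the dissertation's implementation. It assumes the proxy is mapped to the nearest matrix with orthonormal columns via the SVD, and it uses the standard $L_2$ PCA objective (projected variance). The update shown is a simple step-and-reproject scheme standing in for backpropagation through the projection. The function names `project_to_stiefel` and `proxy_pca`, and the parameters `steps` and `lr`, are hypothetical.

```python
import numpy as np

def project_to_stiefel(P):
    """Return the matrix with orthonormal columns closest to P (Frobenius norm)."""
    U, _, Vt = np.linalg.svd(P, full_matrices=False)
    return U @ Vt

def proxy_pca(X, k, steps=200, lr=0.1, seed=0):
    """Estimate a k-dimensional principal subspace of centered data X (d x n).

    Illustrative assumption: an unconstrained proxy matrix P is projected
    onto the manifold each iteration, and the objective trace(U^T C U) is
    increased by a gradient step taken from the projected point.
    """
    rng = np.random.default_rng(seed)
    d, n = X.shape
    C = X @ X.T / n                      # sample covariance (X assumed centered)
    P = rng.standard_normal((d, k))      # unconstrained proxy matrix
    for _ in range(steps):
        U = project_to_stiefel(P)        # point with orthonormal columns
        grad = 2.0 * C @ U               # gradient of trace(U^T C U) w.r.t. U
        P = U + lr * grad                # ascent step in the ambient space
    return project_to_stiefel(P)
```

Calling `proxy_pca(X, 2)` on centered data returns a `d x 2` matrix with orthonormal columns. Only the subspace spanned by those columns is meaningful here, since the objective is invariant to rotations within the subspace, which is the sense in which the solution lives on the Grassmann manifold. Swapping the objective for an $L_p$ variant would change only the gradient line.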
Library of Congress Subject Headings
Grassmann manifolds; Computer vision; Mathematical optimization
Publication Date
8-2019
Document Type
Dissertation
Student Type
Graduate
Degree Name
Engineering (Ph.D.)
Department, Program, or Center
Engineering (KGCOE)
Advisor
Andreas Savakis
Advisor/Committee Member
Christopher Kanan
Advisor/Committee Member
Andres Kwasinski
Recommended Citation
Minnehan, Breton Lawrence, "Deep Grassmann Manifold Optimization for Computer Vision" (2019). Thesis. Rochester Institute of Technology. Accessed from https://repository.rit.edu/theses/10122
Campus
RIT – Main Campus
Plan Codes
ENGR-PHD