Despite the success of deep learning methods on object recognition tasks, one challenge deep learning systems face in the real world is performing well on visually different data samples, i.e., under a distribution shift caused by samples of the same object category drawn from a significantly different visual domain. Many approaches have been proposed in both the domain generalization and domain adaptation settings; however, few works focus on generative modeling in this context, or on studying the structure of the hidden representations learned by deep models. We hypothesize that learning the generative factors and studying the structure of the features learned by the models can lead to a new methodology for the domain generalization and domain adaptation settings. In this work, we propose such a methodology by designing a Variational Autoencoder (VAE) based model with a structured three-part latent code, each part representing a specific aspect of the data. We also use adversarial approaches to make the model robust to changes in visual domain, improving domain generalization performance.
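The "structured three-part latent code" can be illustrated with a minimal sketch. The part names and sizes below (class, domain, and residual factors) are assumptions for illustration, not the thesis's actual architecture, and the toy linear encoder stands in for a learned network; the sketch only shows the reparameterization trick and the partitioning of the latent vector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical partition of the latent code into three parts
# (e.g. class, domain, residual factors) -- an assumption for illustration.
D_CLASS, D_DOMAIN, D_RESIDUAL = 8, 4, 4
D_LATENT = D_CLASS + D_DOMAIN + D_RESIDUAL
D_INPUT = 32

# Toy linear "encoder" producing the posterior parameters of q(z|x);
# in the real model this would be a learned deep network.
W_mu = rng.normal(scale=0.1, size=(D_INPUT, D_LATENT))
W_logvar = rng.normal(scale=0.1, size=(D_INPUT, D_LATENT))

def encode(x):
    """Return mean and log-variance of q(z|x) = N(mu, diag(exp(logvar)))."""
    return x @ W_mu, x @ W_logvar

def reparameterize(mu, logvar):
    """Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def split_latent(z):
    """Split the sampled latent code into its three structured parts."""
    z_class = z[..., :D_CLASS]
    z_domain = z[..., D_CLASS:D_CLASS + D_DOMAIN]
    z_residual = z[..., D_CLASS + D_DOMAIN:]
    return z_class, z_domain, z_residual

x = rng.standard_normal((2, D_INPUT))       # batch of two samples
mu, logvar = encode(x)
z = reparameterize(mu, logvar)
z_class, z_domain, z_residual = split_latent(z)
print(z_class.shape, z_domain.shape, z_residual.shape)  # (2, 8) (2, 4) (2, 4)
```

Keeping the parts as separate slices of one latent vector lets each be supervised or constrained independently (for example, an adversarial loss applied only to the domain slice).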

For domain adaptation, we use semi-supervised learning as the primary tool for adapting model parameters to the data distribution of the target domain. We propose a novel variant of the data augmentation used in semi-supervised methods, based on latent code sampling. We also propose a new adversarial constraint for domain adaptation that does not require explicit information about the domain of a new data sample. In our empirical evaluation, our method performs on par with other state-of-the-art methods in the domain generalization setting, while improving the state of the art on multiple datasets in the domain adaptation setting.
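One plausible reading of "data augmentation based on latent code sampling" is drawing several latent samples from the posterior of the same input and decoding each, yielding stochastic views for a consistency-style semi-supervised loss. The sketch below makes that idea concrete under stated assumptions: the number of views `K` and the linear decoder are hypothetical placeholders, not the thesis's actual design.

```python
import numpy as np

rng = np.random.default_rng(1)

D_LATENT, D_INPUT, K = 16, 32, 4   # K augmented views per example (assumed)

# Toy linear decoder; in the actual model this would be a learned network.
W_dec = rng.normal(scale=0.1, size=(D_LATENT, D_INPUT))

def decode(z):
    return z @ W_dec

def latent_sampling_augment(mu, logvar, k=K):
    """Draw k latent samples from q(z|x) and decode each one,
    producing k stochastic "views" of the same underlying example."""
    eps = rng.standard_normal((k,) + mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps    # (k, batch, D_LATENT)
    return decode(z)                       # (k, batch, D_INPUT)

mu = rng.standard_normal((2, D_LATENT))    # posterior means for a batch of 2
logvar = np.full((2, D_LATENT), -2.0)      # small posterior variance
views = latent_sampling_augment(mu, logvar)
print(views.shape)  # (4, 2, 32)
```

In a semi-supervised setup, a classifier's predictions on the `k` decoded views of one unlabeled example would be encouraged to agree, replacing hand-crafted image augmentations with model-generated ones.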

Publication Date


Document Type


Student Type


Degree Name

Computer Science (MS)

Department, Program, or Center

Computer Science (GCCIS)


Advisor

Linwei Wang

Advisor/Committee Member

Ifeoma Nwogu

Advisor/Committee Member

Yu Kong


RIT – Main Campus