Abstract
Domain Adaptation (DA) techniques aim to overcome the domain shift between a source domain used for training and a target domain used for testing and deployment. Conventional DA methods assume the entire target domain is accessible during the adaptation process. In contrast, we use efficient architectures in a continual, data-constrained DA paradigm where unlabeled target-domain data is received continually in batches. In recent years, Vision Transformers have emerged as an alternative to traditional Convolutional Neural Networks (CNNs) as the feature extraction backbone for image classification and other computer vision tasks. Within the field of DA, these attention-based architectures have proven to be more powerful than their traditional counterparts; however, they incur a larger computational overhead due to their model size. We design a novel framework, called Continual Domain Adaptation through Knowledge Distillation (CAKE), that uses knowledge distillation (KD) to transfer the more complex Vision Transformer's knowledge to a CNN. By doing so, CNN-based adaptation achieves performance similar to transformer-based adaptation while reducing the computational overhead. Within this framework, we selectively choose samples from the incoming batches to store in a buffer for selective replay, and we mix the buffered samples with the incoming samples to incrementally update and adapt our model. We show that distilling to a smaller network after adapting a larger model allows the smaller network to achieve better accuracy than if the smaller network had been adapted to the target domain directly. We also demonstrate that CAKE outperforms state-of-the-art unsupervised domain adaptation methods without full access to the target domain or any access to the source domain.
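To illustrate the distillation-with-replay idea described above, the following is a minimal PyTorch-style sketch, not the thesis's actual implementation: an adapted Vision Transformer acts as the teacher, a CNN student is updated on the incoming target batch mixed with buffered samples, and a standard temperature-scaled KD objective is used. Names such as `vit_teacher`, `cnn_student`, and `replay_buffer.sample` are illustrative assumptions, and the exact sample-selection and adaptation losses used by CAKE are not shown.

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, temperature=2.0):
        # Standard KD objective: KL divergence between temperature-softened
        # student and teacher distributions, scaled by T^2.
        log_p_student = F.log_softmax(student_logits / temperature, dim=1)
        p_teacher = F.softmax(teacher_logits / temperature, dim=1)
        return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2

    def adaptation_step(cnn_student, vit_teacher, incoming_batch, replay_buffer, optimizer):
        # Mix the incoming unlabeled target batch with samples replayed from the buffer
        # (replay_buffer.sample is a hypothetical helper returning a tensor of images).
        replayed = replay_buffer.sample(len(incoming_batch))
        mixed = torch.cat([incoming_batch, replayed], dim=0)

        with torch.no_grad():
            teacher_logits = vit_teacher(mixed)  # adapted transformer serves as the teacher

        student_logits = cnn_student(mixed)
        loss = distillation_loss(student_logits, teacher_logits)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

In this sketch the teacher is frozen during each step, so only the compact CNN is updated as new target batches arrive, which reflects the goal of matching transformer-based adaptation accuracy at CNN-level inference cost.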
Library of Congress Subject Headings
Transfer learning (Machine learning); Neural networks (Computer science); Convolutions (Mathematics)
Publication Date
7-2023
Document Type
Thesis
Student Type
Graduate
Degree Name
Computer Engineering (MS)
Department, Program, or Center
Computer Engineering (KGCOE)
Advisor
Andreas Savakis
Advisor/Committee Member
Dongfang Liu
Advisor/Committee Member
Andres Kwasinski
Recommended Citation
Thomas, Georgi, "Continual Domain Adaptation through Knowledge Distillation" (2023). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/11545
Campus
RIT – Main Campus
Plan Codes
CMPE-MS