Abstract
Human intelligence adapts well from a small number of observations, in part because humans can 1) use given knowledge and 2) distill knowledge from related but different data to guide learning on future tasks; this ability acts as an inductive bias during learning. Deep learning offers a promising path toward artificial intelligence. However, generalizing or adapting deep learning models to heterogeneous tasks remains an open question. Existing data-driven models often ignore prior knowledge about the underlying problems of interest, or are limited in their ability to incorporate complex knowledge into neural networks. The one-size-fits-all formula assumes that training and testing data follow the same distribution, whereas heterogeneity within the training data and distribution shift from training time to test time lead to generalization error. In this dissertation, we approached these challenges from the perspective of improving adaptation with inductive bias, primarily examining the following three research questions: 1) how to learn to adapt with unknown knowledge that can be learned from data, 2) how to adapt deep learning models with known prior knowledge, and 3) how to learn to identify hybrid knowledge with both known priors and unknown errors. To answer the first research question, we proposed a novel concept of learning to adapt to diverse dynamic environments in high-dimensional long-term time series forecasting. To answer the second research question, we first designed neural functions to model the spatiotemporal physics relationships defined on geometrical domains. We then proposed to improve the learning of neural networks given partially known physics with a hybrid state-space framework. For the last research question, we proposed a hybrid gray-box modeling approach that combines the strengths of learning to identify unknown errors from data and adapting with known physics.
In this dissertation, we proposed several novel methods with strong adaptation ability by drawing on ideas from well-studied areas such as variational inference (e.g., variational Bayes), image reconstruction (e.g., electrocardiographic imaging), time-series forecasting (e.g., sequential latent variable models), and few-shot learning (e.g., feedforward meta-learning). We evaluated our algorithms on synthetic and real data in both general and clinical settings, and showed that our approaches yield significant improvement over existing methods. This work furthermore opens the door to many new directions of research related to adaptation.
Library of Congress Subject Headings
Deep learning (Machine learning); Induction (Logic); Time-series analysis
Publication Date
4-2024
Document Type
Dissertation
Student Type
Graduate
Degree Name
Computing and Information Sciences (Ph.D.)
Department, Program, or Center
Computing and Information Sciences Ph.D, Department of
College
Golisano College of Computing and Information Sciences
Advisor
Linwei Wang
Advisor/Committee Member
Qi Yu
Advisor/Committee Member
Nathan Cahill
Recommended Citation
Jiang, Xiajun, "Improving Adaptation of Deep Learning with Inductive Bias" (2024). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/11731
Campus
RIT – Main Campus
Plan Codes
COMPIS-PHD