Abstract

Human intelligence adapts well from a small number of observations, partly because humans can 1) use given knowledge and 2) distill knowledge from related but different data to guide learning on future tasks; this ability serves as the inductive bias during learning. Deep learning offers a promising approach to artificial intelligence. However, generalizing or adapting deep learning models to heterogeneous tasks remains an open question. Existing data-driven models often ignore prior knowledge about the underlying problems of interest, or are limited in their ability to incorporate complex knowledge into neural networks. The one-size-fits-all formulation assumes that the training and test data follow the same distribution, whereas heterogeneity within the training data and distribution shift from training time to test time lead to generalization error. In this dissertation, we approached these challenges from the perspective of improving adaptation with inductive bias, primarily examining the following three research questions: 1) how to learn to adapt with unknown knowledge that can be learned from data, 2) how to adapt deep learning models with known prior knowledge, and 3) how to learn to identify hybrid knowledge with both known priors and unknown errors. To answer the first research question, we proposed a novel concept of learning to adapt to diverse dynamic environments in high-dimensional long-term time-series forecasting. To answer the second research question, we first designed neural functions to model the spatiotemporal physics relationships defined on geometrical domains. We then proposed to improve the learning of neural networks given partially known physics with a hybrid state-space framework. For the last research question, we proposed a hybrid gray-box modeling approach that combines the strengths of learning to identify unknown errors from data and adapting with known physics.
In this dissertation, we proposed several novel methods with strong adaptation ability by drawing ideas from different well-studied areas such as variational inference (e.g., variational Bayes), image reconstruction (e.g., electrocardiographic imaging), time-series forecasting (e.g., sequential latent variable models), and few-shot learning (e.g., feedforward meta-learning). We evaluated our algorithms on synthetic and real data in both general and clinical settings, and showed that our approaches yield significant improvement over existing methods. This work also opens the door to many new directions of research related to adaptation.

Library of Congress Subject Headings

Deep learning (Machine learning); Induction (Logic); Time-series analysis

Publication Date

4-2024

Document Type

Dissertation

Student Type

Graduate

Degree Name

Computing and Information Sciences (Ph.D.)

Department, Program, or Center

Computing and Information Sciences Ph.D, Department of

College

Golisano College of Computing and Information Sciences

Advisor

Linwei Wang

Advisor/Committee Member

Qi Yu

Advisor/Committee Member

Nathan Cahill

Campus

RIT – Main Campus

Plan Codes

COMPIS-PHD
