In this dissertation, we aim to improve the generalization of deep neural networks on biomedical data (e.g., electrocardiogram signals, x-ray images). Although deep neural networks have attained state-of-the-art performance and, consequently, deployment across a variety of domains, achieving similar performance in the clinical setting remains challenging because these networks generalize poorly to unseen data (e.g., a new patient cohort).

We address this generalization challenge from two perspectives: 1) learning disentangled representations with deep networks, and 2) developing efficient semi-supervised learning (SSL) algorithms for deep networks.

In the former, we design architectures and objective functions that learn representations in which the variations in the data are well separated, i.e., disentangled. In the latter, we design regularizers that steer the behavior of the underlying neural function toward a common inductive bias, so that the function does not overfit to a small labeled dataset.

In both approaches, our end goal is to improve the generalization of the deep network underlying the diagnostic model. For disentangled representations, this means learning latent representations that capture the underlying explanatory factors of the observed input in an independent and interpretable way. With the explanatory factors of the data well separated, such a disentangled latent space can be useful for a wide variety of tasks and domains within the data distribution, even with a small amount of labeled data, thus improving generalization. For semi-supervised learning, this means using a large volume of unlabeled data to assist learning from a limited labeled dataset, a situation commonly encountered in the biomedical domain.
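To make the semi-supervised idea concrete, the following is a minimal sketch of a generic objective that adds a regularizer computed on unlabeled data to a supervised loss on a small labeled set. It is not the method proposed in the dissertation: the linear classifier, the Gaussian input perturbation, the squared-difference consistency penalty, and every name and parameter (`predict`, `ssl_loss`, `noise_std`, `lam`) are assumptions chosen only for illustration.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def predict(W, X):
    """Hypothetical linear classifier: class probabilities for each row of X."""
    return softmax(X @ W)


def ssl_loss(W, X_lab, y_lab, X_unl, noise_std=0.1, lam=1.0, seed=0):
    """Supervised cross-entropy plus a consistency penalty on unlabeled data.

    The penalty (an illustrative assumption) asks the classifier to give
    similar predictions for an unlabeled input and a noisy copy of it.
    Returns (total, cross_entropy, consistency).
    """
    rng = np.random.default_rng(seed)

    # Supervised term: mean cross-entropy on the small labeled set.
    p_lab = predict(W, X_lab)
    ce = -np.mean(np.log(p_lab[np.arange(len(y_lab)), y_lab] + 1e-12))

    # Unsupervised term: mean squared difference between predictions on
    # unlabeled inputs and on their randomly perturbed copies.
    X_pert = X_unl + noise_std * rng.standard_normal(X_unl.shape)
    diff = predict(W, X_unl) - predict(W, X_pert)
    consistency = np.mean(np.sum(diff ** 2, axis=1))

    return ce + lam * consistency, ce, consistency


# Toy data standing in for a small labeled set and a larger unlabeled set.
data_rng = np.random.default_rng(42)
W_demo = data_rng.standard_normal((5, 3))
X_lab_demo = data_rng.standard_normal((8, 5))
y_lab_demo = data_rng.integers(0, 3, size=8)
X_unl_demo = data_rng.standard_normal((100, 5))

total, ce, cons = ssl_loss(W_demo, X_lab_demo, y_lab_demo, X_unl_demo)
print(f"total={total:.4f}  cross_entropy={ce:.4f}  consistency={cons:.4f}")
```

In this sketch the weight `lam` controls how strongly the unlabeled data shape the classifier. With `lam=0` the objective reduces to ordinary supervised training on the labeled set alone.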

By drawing on ideas from different areas within deep learning, including representation learning (e.g., autoencoders), variational inference (e.g., variational autoencoders), Bayesian nonparametrics (e.g., the beta-Bernoulli process), learning theory (e.g., analytical learning theory), and function smoothing (e.g., Lipschitz smoothness), we propose several learning algorithms to improve generalization in the associated tasks. We test our algorithms on real-world clinical data and show that our approach yields significant improvement over existing methods. Moreover, we demonstrate the efficacy of the proposed models on benchmark and simulated data to understand different aspects of the proposed learning methods.

We conclude by identifying some of the limitations of the proposed methods, areas of further improvement, and broader future directions for the successful adoption of AI models in the clinical environment.

Library of Congress Subject Headings

Medical informatics; Neural networks (Computer science); Machine learning; Supervised learning (Machine learning)

Degree Name

Computing and Information Sciences (Ph.D.)

Department, Program, or Center

Computer Science (GCCIS)


Linwei Wang

Advisor/Committee Member

Qi Yu

Advisor/Committee Member

Rui Li


RIT – Main Campus
