Abstract
Deep learning has emerged as a powerful tool in medical imaging, assisting healthcare professionals with several decision-making tasks, such as disease diagnosis, surgical intervention, and treatment planning. Supervised deep learning methods typically require large amounts of high-quality labeled data for training. However, acquiring high-quality labeled data in the medical domain is challenging due to factors such as the high cost of expert annotation and the presence of label noise, often caused by inherent variability in user/expert annotation. Models trained on limited or noisy labeled data suffer from poor performance due to overfitting, which reduces their generalizability and trustworthiness for medical applications. In recent years, several methods have been proposed to tackle the challenge of learning with limited and noisy labeled data in general machine learning. However, the complexity of medical data, including factors like subtle distinguishing features, imbalanced classes, and different imaging modalities, makes this challenge even more pronounced in the medical domain. This dissertation explores multiple approaches to overcome the issue of learning with limited and noisy labeled data for robust medical image applications. We first investigated the impact of class-dependent label noise on medical image classifiers to understand its effects when noisy and clean target classes are visually similar. We then introduced a framework to enhance robustness against noisy labels using self-supervised pretraining. As multiple factors influence learning with noisy labels in medical image classification, including the number of classes, dataset complexity, the learning-with-noisy-labels method, the noise type, and the self-supervised pretraining approach, we conducted an in-depth study on the benefits of self-supervised pretraining in improving robustness against label noise across various datasets, taking these factors into account.
Next, to tackle the challenge of limited labeled medical data, we proposed an active learning pipeline that leverages multimodal information to reduce annotation cost. Furthermore, we developed a robust framework for training with noisy labels in imbalanced medical image classification by separating noisy from clean labels and gradually relabeling a subset of critical mislabeled samples selected using active learning techniques. Our final contribution targets robust multimodal learning by addressing the issue of hallucination in Vision Language Models (VLMs). For this application, we created a vision-language medical dataset with hallucination-aware annotations and established initial benchmarks for VLMs, laying the groundwork for future research in medical applications.
Library of Congress Subject Headings
Diagnostic imaging--Data processing; Deep learning (Machine learning); Computer vision; Natural language processing (Computer science); Electronic noise
Publication Date
4-2025
Document Type
Dissertation
Student Type
Graduate
Degree Name
Imaging Science (Ph.D.)
Department, Program, or Center
Chester F. Carlson Center for Imaging Science
College
College of Science
Advisor
Christian Linte
Advisor/Committee Member
Binod Bhattarai
Advisor/Committee Member
Bishesh Khanal
Recommended Citation
Khanal, Bidur, "Towards Robust Deep Learning for Medical Imaging with Limited and Noisy Labeled Data" (2025). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/12136
Campus
RIT – Main Campus
Plan Codes
IMGS-PHD