Abstract

Deep learning models have achieved remarkable success across diverse domains, yet their "black-box" nature poses significant challenges to trust and adoption, particularly in high-stakes applications. This dissertation addresses the critical need for explainable artificial intelligence by advancing model interpretability through three complementary approaches: improved post-hoc explanation methods, internal mechanism understanding, and inherently interpretable architectures. First, we tackle the challenge of explaining video understanding models by proposing Frequency-based Extremal Perturbation (F-EP), which generates spatiotemporally consistent explanations for video classification. By leveraging frequency domain analysis and extending perturbation-based methods to the temporal dimension, F-EP achieves superior spatiotemporal consistency while maintaining high classification accuracy, significantly outperforming existing methods that suffer from background attribution, noisy explanations, and temporal inconsistency. Second, we develop the Generative Class-relevant Neural Pathway (GEN-CNP) model to understand deep networks' internal decision-making processes. Unlike existing approaches that apply uniform sparsity constraints, GEN-CNP introduces dynamic instance-specific sparsity and class-aware feature embedding to identify the most critical neurons. This approach substantially outperforms baseline methods in Average Class IoU metrics, enabling both individual prediction explanations and transferable class-specific interpretations that reveal how networks consistently represent semantic information. Third, we advance self-explainable neural networks by addressing fundamental limitations in prototype-based architectures. We introduce Diversity-Aware Prototype Learning (DAPL), which leverages multi-head self-attention to encourage diverse prototype learning while maintaining classification performance. Enhanced with a foreground-aware training framework, DAPL achieves significant improvements in classification accuracy on fine-grained datasets and remarkable reductions in prototype redundancy across multiple benchmarks. We further establish comprehensive quantitative evaluation metrics for prototype quality assessment, including precision-recall analysis, redundancy measurement, and distribution balance evaluation. Our contributions span the spectrum of explainability approaches, from post-hoc methods for existing models to inherently interpretable architectures. By introducing novel algorithms, evaluation metrics, and rigorous experimental validation across multiple domains, this work advances the field toward more transparent and trustworthy AI systems capable of providing meaningful explanations alongside accurate predictions.

Library of Congress Subject Headings

Deep learning (Machine learning)--Computer simulation; Neural networks (Computer science)

Publication Date

8-2025

Document Type

Dissertation

Student Type

Graduate

Degree Name

Computing and Information Sciences (Ph.D.)

Department, Program, or Center

Computing and Information Sciences Ph.D., Department of

College

Golisano College of Computing and Information Sciences

Advisor

Matthew Wright

Advisor/Committee Member

Rui Li

Advisor/Committee Member

Qi Yu

Campus

RIT – Main Campus

Plan Codes

COMPIS-PHD
