Abstract
Deep neural networks have developed rapidly over the last few years, demonstrating state-of-the-art performance on machine learning tasks such as image classification, natural language processing, and speech recognition. Despite this remarkable performance, deep neural networks are often criticized for their lack of interpretability, which makes it difficult to comprehend their decision-making process and gain insight into their inner workings. Explainable AI has emerged as an important area of study that aims to overcome this issue by providing understandable explanations for deep neural network predictions.

In this thesis, we focus on one explainability method, Integrated Gradients (IG), and propose a contour-based analysis method for assessing the faithfulness of the IG algorithm. Our experiments show that IG is an effective technique for generating attributions for deep neural networks: it produced attributions consistent with human intuition, highlighting relevant regions of the input images. However, significant issues with the performance and interpretability of IG remain. In particular, choosing an appropriate baseline for computing IG attributions is still a challenge. The baseline in this context represents the absence of features and serves as the starting point from which attributions are computed. To address this issue, we assessed the performance of the IG algorithm using multiple random baselines and aggregated the resulting attributions with mean and median techniques to obtain a final attribution. To evaluate the aggregated attributions, we propose a contour-based analysis method that extracts a continuous patch covering the top 10% of the aggregated IG attribution values. This continuous patch of important features allows for a more intuitive interpretation of IG's performance. We use the Captum library to implement the IG algorithm and experiment with multiple random baselines to compare the attributions it generates.

Our results demonstrate that the contour-based analysis method can be used to evaluate the performance of the IG algorithm under different baselines and can also be applied to other attribution algorithms. Our findings suggest that the IG algorithm can identify the most critical elements of an image, and that the contour-based approach can extract more localized and detailed information. This research sheds light on the effect of multiple random baselines on the Integrated Gradients (IG) algorithm and provides valuable insights into its performance when generating attributions for deep neural networks with different baselines. We also identify several limitations of our study, such as focusing on a single model architecture and data type and using a perturbation-based method to create the random baselines. Future work can address these limitations by evaluating the performance of IG on other types of models and data and by using different ways to create the baselines.
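The following is a minimal sketch of the pipeline the abstract describes: Integrated Gradients computed with Captum over several random baselines, aggregated by mean and median, and thresholded to the top 10% of attribution values. The model, image size, baseline distribution, and number of baselines are placeholder assumptions for illustration, not the thesis's exact configuration.

# Sketch only: placeholder model and inputs, not the thesis's actual setup.
import torch
import torch.nn as nn
from captum.attr import IntegratedGradients

# Stand-in classifier; the thesis uses its own trained network.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))
model.eval()

input_img = torch.rand(1, 3, 224, 224)   # stand-in input image
target_class = 0                         # class whose prediction is attributed
num_baselines = 10                       # number of random baselines (assumption)

ig = IntegratedGradients(model)
attributions = []
for _ in range(num_baselines):
    baseline = torch.rand_like(input_img)  # random-noise baseline (assumption)
    attr = ig.attribute(input_img, baselines=baseline, target=target_class)
    attributions.append(attr)

stacked = torch.stack(attributions)          # (num_baselines, 1, 3, H, W)
mean_attr = stacked.mean(dim=0)              # mean-aggregated attribution
median_attr = stacked.median(dim=0).values   # median-aggregated attribution

# Keep only the top 10% of aggregated attribution magnitudes per pixel,
# the values from which the contour-based analysis is built.
pixel_score = mean_attr.abs().sum(dim=1)     # collapse channels to one score per pixel
threshold = torch.quantile(pixel_score.flatten(), 0.90)
top10_mask = pixel_score >= threshold        # binary map of the most important region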
Library of Congress Subject Headings
Deep learning (Machine learning); Explanation-based learning; Artificial intelligence; Neural networks (Computer science)
Publication Date
4-28-2023
Document Type
Thesis
Student Type
Graduate
Degree Name
Software Engineering (MS)
Department, Program, or Center
Software Engineering (GCCIS)
Advisor
Nidhi Rastogi
Advisor/Committee Member
Mohamed Wiem Mkaouer
Advisor/Committee Member
Daniel Krutz
Recommended Citation
Shewale, Ajay, "Investigating the Impact of Baselines on Integrated Gradients for Explainable AI" (2023). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/11483
Campus
RIT – Main Campus
Plan Codes
SOFTENG-MS