Explainable AI (XAI): Core ideas, techniques, and solutions, CSUR, 2023, paper
Techniques for interpretable machine learning, Communications of the ACM, 2019, paper
Post-hoc interpretability for neural NLP: A survey, CSUR, 2023, paper
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead, Nature Machine Intelligence, 2019, paper
Interpretable machine learning: definitions, methods, and applications, 2019, paper
Evaluating explanation methods for deep learning in security, EuroS&P, 2020, paper
A survey of methods for explaining black box models, ACM Computing Surveys, 2018, paper
On the explainability of natural language processing deep models, CSUR, 2022, paper
From Anecdotal Evidence to Quantitative Evaluation Methods: A Systematic Review on Evaluating Explainable AI, CSUR, 2023, paper, repository
Surveys on XAI for LM4Code
A systematic literature review of explainable AI for software engineering, 2023, paper
Explainable AI for SE: Challenges and future directions, 2023, paper
This is an exploration and visualization website for a categorization of explainable AI papers, hosted on GitHub Pages. The initial set of XAI papers was collected and labeled by Nauta et al. (2023) as part of a large-scale literature review on the evaluation of explainable AI, published in ACM Computing Surveys.
SHAP (SHapley Additive exPlanations) is a game-theoretic approach to explain the output of any machine learning model. It connects optimal credit allocation with local explanations using the classic Shapley values from game theory and their related extensions.
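A minimal usage sketch with SHAP; the scikit-learn regressor and dataset below are purely illustrative stand-ins for any model and data:

```python
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

# Train any model; a small scikit-learn regressor is used here only for illustration.
X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Model-agnostic entry point: shap picks a suitable explainer for the callable,
# using the training data as the background distribution.
explainer = shap.Explainer(model.predict, X)
shap_values = explainer(X.iloc[:50])      # Shapley value estimates per row and feature

shap.plots.waterfall(shap_values[0])      # local: one prediction broken down by feature
shap.plots.beeswarm(shap_values)          # global: attribution distribution per feature
```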
LIME explains individual predictions of machine learning classifiers (or models). At the moment, it supports explaining individual predictions for text classifiers, for classifiers that act on tables (NumPy arrays of numerical or categorical data), and for images, via a package called lime (short for local interpretable model-agnostic explanations).
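A minimal sketch of explaining a single text prediction with lime; the tiny scikit-learn pipeline is an illustrative assumption and can be any function that maps a list of texts to class probabilities:

```python
from lime.lime_text import LimeTextExplainer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy classifier standing in for any text model with a predict_proba-style function.
texts = ["great library", "awful bug", "works great", "terrible crash"]
labels = [1, 0, 1, 0]
clf = make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(texts, labels)

explainer = LimeTextExplainer(class_names=["negative", "positive"])
exp = explainer.explain_instance(
    "this tool works great despite one bug",
    clf.predict_proba,    # local surrogate is fit on perturbed versions of this text
    num_features=4,       # number of words to include in the explanation
)
print(exp.as_list())      # (word, weight) pairs of the local linear model
```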
Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer-based networks.
Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the behavior of Transformer-based language models (like GPT-2, BERT, RoBERTa, T5, and T0).
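A minimal sketch of inspecting a generation with Ecco, assuming a small GPT-2 variant; note that the name of the attribution call has changed between Ecco versions, so treat it as an assumption and check the version you have installed:

```python
import ecco

# Load a small GPT-2 variant wrapped for inspection (model name is illustrative).
lm = ecco.from_pretrained('distilgpt2')

# Generate a short continuation; Ecco records what it needs for later visualization.
output = lm.generate("Explainability of language models", generate=10, do_sample=False)

# Interactive token-attribution view in the notebook; in newer Ecco versions this
# call may be named differently (assumption), e.g. a primary_attributions method.
output.saliency()
```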
Captum is a model interpretability and understanding library for PyTorch. Captum means comprehension in Latin and contains general-purpose implementations of integrated gradients, saliency maps, smooth grad, vargrad, and others for PyTorch models. It has quick integration for models built with domain-specific libraries such as torchvision, torchtext, and others.
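A minimal Integrated Gradients sketch with Captum, using a toy two-layer network as a stand-in for any PyTorch model that outputs class scores:

```python
import torch
import torch.nn as nn
from captum.attr import IntegratedGradients

# Toy network purely for illustration; any nn.Module returning class scores works.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 3))
model.eval()

inputs = torch.randn(4, 8)                    # batch of 4 examples, 8 features each

ig = IntegratedGradients(model)
attributions, delta = ig.attribute(
    inputs,
    baselines=torch.zeros_like(inputs),       # reference point for the path integral
    target=0,                                 # attribute the score of class 0
    return_convergence_delta=True,
)
print(attributions.shape)                     # same shape as the inputs
print("convergence delta:", delta)
```

Other Captum attribution classes (e.g. Saliency) follow the same construct-then-attribute pattern, so swapping methods usually only changes the class name and its method-specific arguments.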
The AI Explainability 360 toolkit is an open-source library that supports interpretability and explainability of datasets and machine learning models. Its Python package includes a comprehensive set of algorithms covering different dimensions of explanations, along with proxy explainability metrics, and the toolkit supports tabular, text, image, and time series data.