This repository is a work in progress. The notebooks explore different methods for interpretable AI and will be updated as I learn more about the topic.
Interpretable AI (also called explainable AI) focuses on making the behavior and predictions of machine learning models understandable to humans. This matters for a number of reasons, including:
- Understanding how the model makes predictions
- Debugging the model
- Ensuring that the model is fair and unbiased
- Building trust with users
- Complying with regulations
There are many methods for making AI models more interpretable; a short code sketch of two of them follows this list. Common methods include:
- Feature importance
- Partial dependence plots
- Shapley values
- LIME
- Rule-based models
- Surrogate models
- And many more
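As a quick illustration of two of the methods above (permutation feature importance and partial dependence plots), here is a minimal sketch using scikit-learn. The dataset and model are assumptions chosen for illustration, not necessarily the ones used in the notebooks.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import PartialDependenceDisplay, permutation_importance
from sklearn.model_selection import train_test_split

# Illustrative toy data and model (not the notebooks' setup).
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Permutation feature importance: how much the test score drops when
# each feature's values are randomly shuffled.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
ranked = result.importances_mean.argsort()[::-1]
for i in ranked[:5]:
    print(f"{X.columns[i]}: {result.importances_mean[i]:.4f}")

# Partial dependence: the model's average prediction as the single most
# important feature is varied, with the other features left as observed.
PartialDependenceDisplay.from_estimator(model, X_test, features=[X.columns[ranked[0]]])
```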
The following notebooks are available in this repository; a brief, illustrative sketch of the LIME and SHAP APIs they build on appears after the list:
- Interpretable Machine Learning, Christoph Molnar
- LIME "Why Should I Trust You?": Explaining the Predictions of Any Classifier, Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin, 2016
- SHAP (SHapley Additive exPlanations) "A Unified Approach to Interpreting Model Predictions" Lundberg et al., 2017
- Counterfactual "Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR", Wachter et. al., 2017
- Evaluating LIME and SHAP using Counterfactuals"Towards Unifying Feature Attribution and Counterfactual Explanations: Different Means to the Same End", Mothilal et. al., 2021
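For orientation, here is a hedged sketch of the core LIME and SHAP calls, using the `shap` and `lime` packages with a scikit-learn random forest on a toy dataset. The dataset, model, and parameter values are illustrative assumptions, not the setups used in the notebooks.

```python
import shap
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# Illustrative toy data and model (not the notebooks' setup).
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

# SHAP: Shapley-value feature attributions for a tree-based model.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[:100])

# LIME: fit a sparse local linear surrogate around a single instance.
lime_explainer = LimeTabularExplainer(
    X.values,
    feature_names=list(X.columns),
    class_names=["malignant", "benign"],
    mode="classification",
)
explanation = lime_explainer.explain_instance(
    X.values[0], model.predict_proba, num_features=5
)
print(explanation.as_list())
```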