
C1DATA--02 | Explainability and Interpretability in Artificial Neural Networks

31
Language: English
Location: Département d'Etudes Cognitives, ENS Paris, 29 rue d'Ulm
The dramatic success of machine learning over the last ten years has led to an explosion of Artificial Intelligence (AI) applications across science and society. Artificial neural networks are becoming a ubiquitous component of essential systems in domains such as medical diagnosis, policy-making, transport and scientific research. Yet these algorithms essentially function as black boxes that produce outputs and decisions through a process human users do not directly understand. This raises important issues of trust, security and fairness for the use of AI algorithms.

The emerging field of Explainable and Interpretable AI seeks to address these issues by developing methods for interpreting existing AI models and proposing new models that are built to be transparent. This course will provide an overview of the ongoing work in this domain.

Goals of the course:
- motivate why explainability and interpretability are important;
- develop an understanding of what interpretability and explainability mean in different contexts;
- provide an overview of methods and approaches for interpretable and explainable AI;
- study in detail a set of use cases from recent research papers.
Preliminary program:

1. Introduction to Interpretability and Explainability
- What is interpretability in machine learning?
- Why interpret neural network models?
- When is interpretation needed?
- Interpretable vs. explainable AI

2. Overview of Interpretability Methods
- General setting
- Intrinsic vs. post-hoc interpretability
- Global vs. local interpretability
- Evaluating explanations

3. Inherently Interpretable Models
- Linear regression, generalized linear models, generalized additive models
- Decision trees (see the sketch below)
- Dynamical systems

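To give a concrete flavour of what "inherently interpretable" means, here is a minimal sketch that fits a shallow decision tree and prints it as human-readable rules. It assumes scikit-learn is available; the iris dataset and the depth limit are illustrative choices, not part of the course material.

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Fit a depth-limited tree; the iris dataset is illustrative only.
data = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(data.data, data.target)

# The entire model can be printed as a small set of if/then rules;
# this direct readability is what "inherently interpretable" means.
print(export_text(tree, feature_names=data.feature_names))
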
4. Model-Agnostic Methods
- Local surrogate models (LIME)
- Counterfactual explanations
- Shapley values (see the sketch below)

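As an illustration of the Shapley-value idea, the following self-contained sketch computes exact Shapley attributions for a toy three-feature model. Replacing "absent" features with a fixed baseline is one common convention (practical toolkits such as SHAP average over a background dataset instead); the model f and the inputs are purely illustrative.

from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, with 'absent' features set to baseline."""
    n = len(x)

    def value(subset):
        # Evaluate f with features outside `subset` replaced by the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            for S in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value(set(S) | {i}) - value(set(S)))
        phi.append(total)
    return phi

# Toy model with an interaction term; the attributions sum to
# f(x) - f(baseline) = 8.0 here (the "efficiency" property).
f = lambda z: 2 * z[0] + z[1] * z[2]
print(shapley_values(f, x=[1.0, 2.0, 3.0], baseline=[0.0, 0.0, 0.0]))
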
5. Post-hoc Interpretation Methods for Neural Networks
- Feature visualisation and network dissection
- Attribution methods and saliency maps (see the sketch below)
- Applications: medical imaging

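As a taste of attribution methods, here is a minimal sketch of vanilla gradient saliency. It assumes PyTorch; the untrained toy network and the random input stand in for a real trained classifier and a real image.

import torch
import torch.nn as nn

# Tiny stand-in classifier; a real use case would load a trained model.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
)
model.eval()

# Stand-in "image"; requires_grad lets us differentiate w.r.t. pixels.
image = torch.rand(1, 3, 32, 32, requires_grad=True)

score = model(image)[0].max()   # score of the highest-scoring class
score.backward()                # gradient of that score w.r.t. the input

# Saliency map: gradient magnitude, maximised over colour channels.
saliency = image.grad.abs().max(dim=1).values
print(saliency.shape)           # torch.Size([1, 32, 32])
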
6. Model Reduction
- Distillation (see the sketch below)
- Decision trees
- Applications: medical diagnosis, legal analysis
- Dynamical systems
- Application: scientific discovery in neuroscience

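To illustrate distillation as model reduction, the following sketch trains a shallow decision tree "student" to mimic a neural-network "teacher", turning an opaque decision function into readable rules. It assumes scikit-learn; the synthetic dataset, teacher architecture and tree depth are illustrative choices only.

from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic data and an MLP "teacher" standing in for a black-box model.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
teacher = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=1000,
                        random_state=0).fit(X, y)

# The "student" tree is fit to the teacher's predictions, not the true
# labels, so it approximates the teacher's decision function in a
# directly readable form.
student = DecisionTreeClassifier(max_depth=4, random_state=0)
student.fit(X, teacher.predict(X))

print("fidelity to teacher:", student.score(X, teacher.predict(X)))
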
7. Wrap-up: Open Issues and the Road Ahead

 
Evaluation: short presentation or report describing a specific use case.

 
Prerequisites: strong background in linear algebra, probability and machine learning; basics of artificial neural networks; programming in Python.

 

 
Teaching staff: Srdjan Ostojic, coordinator and instructor