Foundations of Interpretable Deep Learning

Tutorial held in conjunction with AAAI 2026
Singapore EXPO (Room Garnet 218), Singapore
January 21st, 2026 from 8:30am to 12:30pm

Introduction

As notoriously opaque deep neural networks (DNNs) become commonplace in powerful Artificial Intelligence (AI) systems, Interpretable Deep Learning (IDL) has emerged as a promising direction for designing interpretable-by-construction neural architectures. At their core, IDL models learn a latent space where some of their representations are aligned with high-level units of information, or concepts, that domain experts are familiar with (e.g., “striped texture”, “round object”). By introducing inductive biases that encourage predictions to be made based on these interpretable representations, IDL models enable the construction of expressive yet highly transparent architectures that can be vetted, analysed, and intervened on.
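
To make this design concrete, below is a minimal sketch (in PyTorch) of a concept-bottleneck-style model, one canonical example of an interpretable-by-construction architecture: an encoder maps inputs to scores for human-interpretable concepts, and the task prediction is made only from those concept scores. All names, layer sizes, and the loss weighting here are illustrative assumptions, not part of the tutorial material.

    import torch
    import torch.nn as nn

    class ConceptBottleneckModel(nn.Module):
        """Minimal concept-bottleneck-style IDL sketch (illustrative only).

        The encoder maps inputs to scores for k human-interpretable concepts
        (e.g., "striped texture", "round object"); the label head then predicts
        the task label *only* from those concept scores, so predictions can be
        inspected and intervened on at the concept level.
        """

        def __init__(self, input_dim: int, n_concepts: int, n_classes: int):
            super().__init__()
            # Encoder: input features -> one logit per concept.
            self.concept_encoder = nn.Sequential(
                nn.Linear(input_dim, 128), nn.ReLU(), nn.Linear(128, n_concepts)
            )
            # Label predictor: concept probabilities -> task logits.
            self.label_predictor = nn.Linear(n_concepts, n_classes)

        def forward(self, x, concept_interventions=None):
            concept_probs = torch.sigmoid(self.concept_encoder(x))
            # Optional expert intervention: overwrite predicted concepts with
            # known values (mask is a boolean tensor, values a float tensor).
            if concept_interventions is not None:
                mask, values = concept_interventions
                concept_probs = torch.where(mask, values, concept_probs)
            return concept_probs, self.label_predictor(concept_probs)

    # Joint training objective: task loss plus a concept-alignment loss that
    # encourages the bottleneck to match annotated concepts (alpha is an
    # arbitrary illustrative weight).
    def joint_loss(concept_probs, class_logits, concept_labels, class_labels, alpha=0.5):
        task_loss = nn.functional.cross_entropy(class_logits, class_labels)
        concept_loss = nn.functional.binary_cross_entropy(concept_probs, concept_labels)
        return task_loss + alpha * concept_loss

Because the label head only sees the concept scores, a domain expert can inspect which concepts drove a prediction and intervene at test time by overwriting mispredicted concepts; the tutorial covers this family of models alongside other IDL approaches.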

This tutorial aims to capitalise on the surge of interest in IDL by exposing AI researchers and engineers to the core foundations needed to understand the general principles behind existing IDL models. In doing so, we aim to equip attendees with the knowledge to navigate the current state of this extensive body of literature and to build upon it in their own research. Specifically, this tutorial will provide an overview of (1) core interpretability principles, (2) seminal works in the field, and (3) recent directions in IDL. Our session will include hands-on demonstrations throughout and will conclude with a discussion of the key open questions in the field.

Required Background

Our material will assume a basic knowledge of ML (e.g., foundations of supervised learning, experimental design, basic probabilistic modelling), with particular emphasis on a solid Deep Learning foundation (e.g., tensor calculus, neural networks, backpropagation). Concepts requiring mathematical tools or expertise beyond those one would expect to be shared across the AAAI community will be (re)introduced during the tutorial.

All relevant material used and discussed during the tutorial, including a recording of the tutorial, will be made available here.

Previous Tutorial Iterations

A related but distinct previous iteration of this tutorial was run at AAAI 2025 under the name “Concept-based Interpretable Deep Learning”. A more closely related iteration of the present tutorial was run as a short talk at the Neuro-Symbolic AI Summer School 2025.

Please see here for further details on these previous tutorials, including slides and materials.

Important Details

  • Date: This tutorial will be held on January 21st, 2026 from 8:30am to 12:30pm (incl. 30-minute break).
  • Conference: The 40th Annual AAAI Conference on Artificial Intelligence.
  • Location: Singapore EXPO (Room: Garnet 218), Singapore.
  • Modality: In-person event with the option to join online via Underline (requires AAAI tutorial registration).

Presenters

Contact

For any questions, please do not hesitate to contact Mateo at mateo.espinosazarlenga@trinity.ox.uk.