Rethinking deep vision interpretability

For his PhD award talk at GRETSI, Thomas Fel presented his work on the explainability of deep vision models. Among the various approaches, attribution methods (e.g. Grad-CAM) produce saliency maps indicating which part of the input the neural network uses to build up its response.

However, these attribution methods only tell you about the where, not the what: they say nothing about the nature of the pattern the neural network has extracted. In this work, T. Fel develops techniques to extract concepts from deep neural networks. Concepts are extracted with non-negative matrix factorization techniques (Gillis, 2024); a survey on concept-based explainable AI can be found in (Poeta et al., 2023). In deep neural networks, there is nothing like the grandmother cell, a single cell that would alone encode a feature; at least, this is not always true. Instead, concepts are encoded by distributed patterns of activations (missing reference).
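To make the factorization idea concrete, here is a minimal sketch of concept extraction via NMF using scikit-learn, with random non-negative values standing in for the activations of a real network layer (the actual pipeline would collect post-ReLU activations from a trained model):

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
# Stand-in for non-negative deep activations:
# 200 image patches x 64 channels (in practice these would come
# from a ReLU layer of a trained CNN).
A = rng.random((200, 64))

# Factorize A ~= U @ W: each row of W is a "concept" direction in
# activation space, each row of U gives the concept coefficients
# (how much each concept is present) for one patch.
n_concepts = 10
nmf = NMF(n_components=n_concepts, init="nndsvda",
          random_state=0, max_iter=500)
U = nmf.fit_transform(A)   # (200, 10), non-negative codes
W = nmf.components_        # (10, 64), non-negative concept basis

rel_err = np.linalg.norm(A - U @ W) / np.linalg.norm(A)
print(U.shape, W.shape, rel_err)
```

The non-negativity constraint is what makes the factors interpretable as additive parts, which is why NMF is preferred here over unconstrained factorizations.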

In his talk, he also mentioned the Linear Representation Hypothesis, according to which features could be encoded as directions in the latent space of neural networks. However, as he pointed out, in practice this does not always hold because of the steering problem: pushing a representation toward such a vector does steer the output along the corresponding feature, but at some point it fails.
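The steering mechanics can be sketched in a few lines on a toy model. Everything here is an assumption for illustration: the latent vector, the feature direction `v` (which in practice would be found e.g. by probing), and the bounded readout are all synthetic. The point is that adding `alpha * v` moves the representation along the feature, but a bounded downstream readout eventually saturates, so ever-larger pushes stop changing the output:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 32
h = rng.normal(size=d)          # toy latent representation
v = rng.normal(size=d)          # hypothetical feature direction
v /= np.linalg.norm(v)

def steer(h, v, alpha):
    """Push the representation along the feature direction v."""
    return h + alpha * v

# Toy downstream readout with a saturating non-linearity.
w_out = rng.normal(size=d)
def readout(h):
    return np.tanh(w_out @ h / np.sqrt(d))

# The readout responds to moderate pushes, but since tanh is
# bounded it saturates for large alpha: steering eventually fails.
for alpha in [0.0, 1.0, 5.0, 50.0]:
    print(alpha, readout(steer(h, v, alpha)))
```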

In (Fel et al., 2023), he introduces a unifying framework for concept extraction techniques (e.g. K-means, PCA, or NMF): they all fall into the category of dictionary learning.
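A quick way to see the dictionary-learning view is to write all three methods as an approximation A ≈ U W, where W is the dictionary and U the codes, with different constraints on the factors. The sketch below (again on random stand-in activations) recovers the three factorizations with scikit-learn:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA, NMF

rng = np.random.default_rng(0)
A = rng.random((100, 20))   # stand-in for non-negative activations
k = 5

# PCA: W = principal directions, U = projections (orthogonality
# constraint on W, U unconstrained). PCA centers the data, so the
# mean must be added back when reconstructing.
pca = PCA(n_components=k).fit(A)
U_pca, W_pca = pca.transform(A), pca.components_

# K-means: W = centroids, U = one-hot assignments
# (the hardest constraint: exactly one active atom per sample).
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(A)
W_km = km.cluster_centers_
U_km = np.eye(k)[km.labels_]

# NMF: both factors constrained to be non-negative.
nmf = NMF(n_components=k, init="nndsvda", random_state=0, max_iter=500)
U_nmf, W_nmf = nmf.fit_transform(A), nmf.components_

recs = {
    "pca":    U_pca @ W_pca + pca.mean_,
    "kmeans": U_km @ W_km,
    "nmf":    U_nmf @ W_nmf,
}
for name, rec in recs.items():
    print(name, round(np.linalg.norm(A - rec) / np.linalg.norm(A), 3))
```

Same template, three constraint sets: that is the sense in which these concept extractors are all instances of dictionary learning.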

The Lens project provides visualizations of concepts extracted from a large vision model trained on ImageNet.

Finally, I would like to mention the explainability toolbox Xplique, which implements several explainability methods, in particular the concept-based approaches mentioned above.

References

  1. Nonnegative Matrix Factorization
    Nicolas Gillis
    Jul 2024
  2. Concept-based Explainable Artificial Intelligence: A Survey
    Eleonora Poeta, Gabriele Ciravegna, Eliana Pastor, and 2 more authors
    Jul 2023
  3. A Holistic Approach to Unifying Automatic Concept Extraction and Concept Importance Estimation
    Thomas Fel, Victor Boutin, Mazda Moayeri, and 5 more authors
    Jul 2023
