XAI Handbook: Towards a Unified Framework for Explainable AI

2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)

Abstract

The field of explainable AI (XAI) has quickly become a thriving and prolific community. However, a silent, recurrent and acknowledged issue in this area is the lack of consensus regarding its terminology. In particular, each new contribution seems to rely on its own (and often intuitive) version of terms like "explanation" and "interpretation". Such disarray encumbers the consolidation of advances in the field towards the fulfillment of scientific and regulatory demands, e.g., when comparing methods or establishing their compliance with respect to biases and fairness constraints. We propose a theoretical framework that not only provides concrete definitions for these terms, but also outlines all the steps necessary to produce explanations and interpretations. The framework also allows existing contributions to be recontextualized such that their scope can be measured, thus making them comparable to other methods. We show that this framework is compliant with desiderata on explanations, on interpretability and on evaluation metrics. We present a use-case showing how the framework can be used to compare LIME, SHAP and MDNet, establishing their advantages and shortcomings. Finally, we discuss relevant trends in XAI as well as recommendations for future work, all from the standpoint of our framework.
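The use-case compares LIME, SHAP and MDNet through the lens of the framework. For readers unfamiliar with the first two, the snippet below is a minimal, illustrative sketch of how each library is commonly invoked to attribute a single prediction of the same classifier; the scikit-learn dataset, model and hyperparameters are assumptions for demonstration only, not the paper's experimental setup (MDNet, a medical-imaging model, is omitted here).

```python
# Illustrative sketch only: typical LIME and SHAP calls for one prediction.
# Dataset, model and parameters are assumptions, not the paper's setup.
import shap
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
X, y = data.data, data.target
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# LIME: fit a local surrogate around one instance and report the
# top-weighted features for that prediction.
lime_explainer = LimeTabularExplainer(
    X,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)
lime_exp = lime_explainer.explain_instance(X[0], model.predict_proba, num_features=5)
print("LIME:", lime_exp.as_list())

# SHAP: estimate Shapley-value attributions for the same instance
# (TreeExplainer is the tree-specific estimator).
shap_explainer = shap.TreeExplainer(model)
shap_values = shap_explainer.shap_values(X[:1])
print("SHAP:", shap_values)
```

Both calls yield per-feature attributions for a single prediction; under the proposed framework, such outputs would be recontextualized so that their scope can be measured and compared against methods like MDNet.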

References

  1. Al-Shedivat, M., Dubey, A., and Xing, E. Contextual explanation networks. Journal of Machine Learning Research, 21(194):1-44, 2020.
  2. Alvarez-Melis, D. and Jaakkola, T. S. Towards robust interpretability with self-explaining neural networks. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, pp. 7786-7795, 2018.
  3. Arrieta, A. B., Díaz-Rodríguez, N., Del Ser, J., Bennetot, A., Tabik, S., Barbado, A., García, S., Gil-López, S., Molina, D., Benjamins, R., et al. Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion, 58:82-115, 2020.
  4. Athalye, A., Carlini, N., and Wagner, D. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In Dy, J. and Krause, A. (eds.), Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pp. 274-283, Stockholmsmässan, Stockholm, Sweden, 10-15 Jul 2018. PMLR.
  5. Bach, S., Binder, A., Montavon, G., Klauschen, F., Müller, K.-R., and Samek, W. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE, 10(7):e0130140, 2015.
  6. Baraniuk, C. The 'creepy Facebook AI' story that captivated the media, Aug 2017. URL https://www.bbc.com/news/technology-40790258.
  7. Bau, D., Zhou, B., Khosla, A., Oliva, A., and Torralba, A. Network dissection: Quantifying interpretability of deep visual representations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6541-6549, 2017.
  8. Bibal, A. and Frénay, B. Interpretability of machine learning models and representations: an introduction. In ESANN, 2016.
  9. Biran, O. and Cotton, C. Explanation and justification in machine learning: A survey. In IJCAI-17 Workshop on Explainable AI (XAI), volume 8, pp. 8-13, 2017.
  10. Bussmann, N., Giudici, P., Marinelli, D., and Papenbrock, J. Explainable machine learning in credit risk management. Computational Economics, September 2020. ISSN 1572-9974. doi: 10.1007/s10614-020-10042-0. URL https://doi.org/10.1007/s10614-020-10042-0.
  11. Carlini, N., Athalye, A., Papernot, N., Brendel, W., Rauber, J., Tsipras, D., Goodfellow, I., Madry, A., and Kurakin, A. On evaluating adversarial robustness. arXiv preprint arXiv:1902.06705, 2019.
  12. Carrieri, A. P., Haiminen, N., Maudsley-Barton, S., Gardiner, L.-J., Murphy, B., Mayes, A., Paterson, S., Grimshaw, S., Winn, M., Shand, C., et al. Explainable AI reveals key changes in skin microbiome associated with menopause, smoking, aging and skin hydration. bioRxiv, 2020.
  13. Chang, C.-H., Creager, E., Goldenberg, A., and Duvenaud, D. Explaining image classifiers by counterfactual generation. In International Conference on Learning Representations, 2019.
  14. Ciatto, G., Calvaresi, D., Schumacher, M. I., and Omicini, A. An abstract framework for agent-based explanations in AI. In Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems, pp. 1816-1818, 2020.
  15. Dam, H. K., Tran, T., and Ghose, A. Explainable software analytics. In Proceedings of the 40th International Conference on Software Engineering: New Ideas and Emerging Results, pp. 53-56, 2018.
  16. D'Amour, A., Heller, K., Moldovan, D., Adlam, B., Alipanahi, B., Beutel, A., Chen, C., Deaton, J., Eisenstein, J., Hoffman, M. D., et al. Underspecification presents challenges for credibility in modern machine learning. arXiv preprint arXiv:2011.03395, 2020.
  17. de Sousa, I. P., Vellasco, M. M. B. R., and da Silva, E. C. Evolved explainable classifications for lymph node metastases. arXiv preprint arXiv:2005.07229, 2020.
  18. Deng, J., Ding, N., Jia, Y., Frome, A., Murphy, K., Bengio, S., Li, Y., Neven, H., and Adam, H. Large-scale object classification using label relation graphs. In European conference on computer vision, pp. 48-64. Springer, 2014.
  19. Dombrowski, A.-K., Alber, M., Anders, C., Ackermann, M., Müller, K.-R., and Kessel, P. Explanations can be manipulated and geometry is to blame. In Advances in Neural Information Processing Systems, pp. 13589-13600, 2019.
  20. Doshi-Velez, F. and Kim, B. Considerations for evaluation and generalization in interpretable machine learning. In Explainable and Interpretable Models in Computer Vision and Machine Learning, pp. 3-17. Springer, 2018.
  21. Esser, P., Rombach, R., and Ommer, B. A disentangling invertible interpretation network for explaining latent representations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9223-9232, 2020.
  22. Frye, C., de Mijolla, D., Cowton, L., Stanley, M., and Feige, I. Shapley-based explainability on the data manifold. arXiv preprint arXiv:2006.01272, 2020.
  23. Ghorbani, A., Abid, A., and Zou, J. Interpretation of neural networks is fragile. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pp. 3681-3688, 2019.
  24. Jain, S. and Wallace, B. C. Attention is not explanation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 3543-3556, 2019.
  25. Josephson, J. R. and Josephson, S. G. Abductive Inference: Computation, Philosophy, Technology. Cambridge University Press, 1996.
  26. Kang, M.-J. and Kang, J.-W. Intrusion detection system using deep neural network for in-vehicle network security. PLoS ONE, 11(6):e0155781, 2016.
  27. Kim, B., Wattenberg, M., Gilmer, J., Cai, C., Wexler, J., Viegas, F., et al. Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (TCAV). In International Conference on Machine Learning, pp. 2668-2677. PMLR, 2018.
  28. King, T. D. Human color perception, cognition, and culture: why red is always red. In Color Imaging X: Processing, Hardcopy, and Applications, volume 5667, pp. 234-242. International Society for Optics and Photonics, 2005.
  29. Lakkaraju, H., Kamar, E., Caruana, R., and Leskovec, J. Faithful and customizable explanations of black box models. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, pp. 131-138, 2019.
  30. Leavitt, M. L. and Morcos, A. Towards falsifiable interpretability research, 2020.
  31. Lewis, D. Causal explanation. Philosophical Papers, vol. 2, 1986.
  32. Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., and Zitnick, C. L. Microsoft COCO: Common objects in context. In European Conference on Computer Vision, pp. 740-755. Springer, 2014.
  33. Lipton, Z. C. The mythos of model interpretability. Queue, 2018.
  34. Lombrozo, T. The structure and function of explanations. Trends in Cognitive Sciences, 10(10):464-470, 2006.
  35. Lopez-Paz, D., Nishihara, R., Chintala, S., Schölkopf, B., and Bottou, L. Discovering causal signals in images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6979-6987, 2017.
  36. Lundberg, S. M. and Lee, S.-I. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems, pp. 4765-4774, 2017.
  37. Miller, T. Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence, 2019.
  38. Miller, T., Howe, P., and Sonenberg, L. Explainable AI: Beware of inmates running the asylum. In IJCAI 2017 Workshop on Explainable Artificial Intelligence (XAI), 2017. URL http://people.eng.unimelb.edu.au/tmiller/pubs/explanation-inmates.pdf.
  39. Montavon, G., Samek, W., and Müller, K.-R. Methods for interpreting and understanding deep neural networks. Digital Signal Processing, 2018.
  Council of the European Union and European Parliament. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation), 2016.
  40. Palacio, S., Folz, J., Hees, J., Raue, F., Borth, D., and Dengel, A. What do deep networks like to see? In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
  41. Ras, G., van Gerven, M., and Haselager, P. Explanation methods in deep learning: Users, values, concerns and challenges. In Explainable and Interpretable Models in Computer Vision and Machine Learning, pp. 19-36. Springer, 2018.
  42. Ribeiro, M. T., Singh, S., and Guestrin, C. "Why should I trust you?": Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135-1144, 2016.
  43. Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1(5):206-215, 2019.
  44. Rudin, C. and Ustun, B. Optimized scoring systems: Toward trust in machine learning for healthcare and criminal justice. Interfaces, 48(5):449-466, 2018.
  45. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., et al. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3):211-252, 2015.
  46. Schmid, U. and Finzel, B. Mutual explanations for cooperative decision making in medicine. KI-Künstliche Intelligenz, pp. 1-7, 2020.
  47. Schneeberger, D., Stöger, K., and Holzinger, A. The European legal framework for medical AI. In Holzinger, A., Kieseberg, P., Tjoa, A. M., and Weippl, E. (eds.), Machine Learning and Knowledge Extraction, pp. 209-226, Cham, 2020. Springer International Publishing. ISBN 978-3-030-57321-8.
  48. Slack, D., Hilgard, S., Jia, E., Singh, S., and Lakkaraju, H. Fooling LIME and SHAP: Adversarial attacks on post hoc explanation methods. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, pp. 180-186, 2020.
  49. Smeulders, A. W., Worring, M., Santini, S., Gupta, A., and Jain, R. Content-based image retrieval at the end of the early years. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(12):1349-1380, 2000.
  50. Sokol, K. and Flach, P. Explainability fact sheets: A framework for systematic assessment of explainable approaches. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, FAT* '20, pp. 56-67, New York, NY, USA, 2020. Association for Computing Machinery. ISBN 9781450369367. doi: 10.1145/3351095.3372870.
  51. Sowa, J. F. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley Pub., Reading, MA, 1983.
  52. Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., and Fergus, R. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013.
  53. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I. Attention is all you need. In Advances in Neural Information Processing Systems, pp. 5998-6008, 2017.
  54. Vilone, G. and Longo, L. Explainable artificial intelligence: a systematic review. arXiv preprint arXiv:2006.00093, 2020.
  55. Wiegreffe, S. and Pinter, Y. Attention is not not explanation. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 11-20, 2019.
  56. Xie, N., Ras, G., van Gerven, M., and Doran, D. Explainable deep learning: A field guide for the uninitiated. arXiv preprint arXiv:2004.14545, 2020.
  57. Zhang, Z., Xie, Y., Xing, F., McGough, M., and Yang, L. MDNet: A semantically and visually interpretable medical image diagnosis network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6428-6436, 2017.