XAI Handbook: Towards a Unified Framework for Explainable AI

2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)

Abstract

The field of explainable AI (XAI) has quickly become a thriving and prolific community. However, a silent, recurrent and acknowledged issue in this area is the lack of consensus regarding its terminology. In particular, each new contribution seems to rely on its own (and often intuitive) version of terms like "explanation" and "interpretation". Such disarray encumbers the consolidation of advances in the field towards the fulfillment of scientific and regulatory demands, e.g., when comparing methods or establishing their compliance with respect to biases and fairness constraints. We propose a theoretical framework that not only provides concrete definitions for these terms, but also outlines all the steps necessary to produce explanations and interpretations. The framework also allows existing contributions to be recontextualized such that their scope can be measured, thus making them comparable to other methods. We show that this framework is compliant with desiderata on explanations, on interpretability and on evaluation metrics. We present a use-case showing how the framework can be used to compare LIME, SHAP and MDNet, establishing their advantages and shortcomings. Finally, we discuss relevant trends in XAI as well as recommendations for future work, all from the standpoint of our framework.
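The use-case compares LIME, SHAP and MDNet through the lens of the framework. For readers unfamiliar with the first two, the snippet below is a minimal, illustrative sketch of how each library is commonly invoked to attribute a single prediction of the same classifier; the scikit-learn dataset, model and hyperparameters are assumptions for demonstration only, not the paper's experimental setup (MDNet, a medical-imaging model, is omitted here).

```python
# Illustrative sketch only: typical LIME and SHAP calls for one prediction.
# Dataset, model and parameters are assumptions, not the paper's setup.
import shap
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
X, y = data.data, data.target
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# LIME: fit a local surrogate around one instance and report the
# top-weighted features for that prediction.
lime_explainer = LimeTabularExplainer(
    X,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)
lime_exp = lime_explainer.explain_instance(X[0], model.predict_proba, num_features=5)
print("LIME:", lime_exp.as_list())

# SHAP: estimate Shapley-value attributions for the same instance
# (TreeExplainer is the tree-specific estimator).
shap_explainer = shap.TreeExplainer(model)
shap_values = shap_explainer.shap_values(X[:1])
print("SHAP:", shap_values)
```

Both calls yield per-feature attributions for a single prediction; under the proposed framework, such outputs would be recontextualized so that their scope can be measured and compared against methods like MDNet.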

References

  1. Al-Shedivat, M., Dubey, A., and Xing, E. Contextual explanation networks. Journal of Machine Learning Research, 21(194):1-44, 2020.
  2. Alvarez-Melis, D. and Jaakkola, T. S. Towards robust interpretability with self-explaining neural networks. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, pp. 7786-7795, 2018.
  3. Arrieta, A. B., Díaz-Rodríguez, N., Del Ser, J., Bennetot, A., Tabik, S., Barbado, A., García, S., Gil-López, S., Molina, D., Benjamins, R., et al. Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion, 58:82-115, 2020.
  4. Athalye, A., Carlini, N., and Wagner, D. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In Dy, J. and Krause, A. (eds.), Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pp. 274-283, Stockholmsmässan, Stockholm, Sweden, 10-15 Jul 2018. PMLR.
  5. Bach, S., Binder, A., Montavon, G., Klauschen, F., Müller, K.-R., and Samek, W. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE, 10(7):e0130140, 2015.
  6. Baraniuk, C. The 'creepy Facebook AI' story that captivated the media, Aug 2017. URL https://www.bbc.com/news/technology-40790258.
  7. Bau, D., Zhou, B., Khosla, A., Oliva, A., and Torralba, A. Network dissection: Quantifying interpretability of deep visual representations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6541-6549, 2017.
  8. Bibal, A. and Frénay, B. Interpretability of machine learning models and representations: an introduction. In ESANN, 2016.
  9. Biran, O. and Cotton, C. Explanation and justification in machine learning: A survey. In IJCAI-17 Workshop on Explainable AI (XAI), volume 8, pp. 8-13, 2017.
  10. Bussmann, N., Giudici, P., Marinelli, D., and Papenbrock, J. Explainable machine learning in credit risk management. Computational Economics, September 2020. ISSN 1572-9974. doi: 10.1007/s10614-020-10042-0. URL https://doi.org/10.1007/s10614-020-10042-0.
  11. Carlini, N., Athalye, A., Papernot, N., Brendel, W., Rauber, J., Tsipras, D., Goodfellow, I., Madry, A., and Kurakin, A. On evaluating adversarial robustness. arXiv preprint arXiv:1902.06705, 2019.
  12. Carrieri, A. P., Haiminen, N., Maudsley-Barton, S., Gardiner, L.-J., Murphy, B., Mayes, A., Paterson, S., Grimshaw, S., Winn, M., Shand, C., et al. Explainable AI reveals key changes in skin microbiome associated with menopause, smoking, aging and skin hydration. bioRxiv, 2020.
  13. Chang, C.-H., Creager, E., Goldenberg, A., and Duvenaud, D. Explaining image classifiers by counterfactual generation. In International Conference on Learning Representations, 2019.
  14. Ciatto, G., Calvaresi, D., Schumacher, M. I., and Omicini, A. An abstract framework for agent-based explanations in AI. In Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems, pp. 1816-1818, 2020.
  15. Dam, H. K., Tran, T., and Ghose, A. Explainable software analytics. In Proceedings of the 40th International Conference on Software Engineering: New Ideas and Emerging Results, pp. 53-56, 2018.
  16. D'Amour, A., Heller, K., Moldovan, D., Adlam, B., Alipanahi, B., Beutel, A., Chen, C., Deaton, J., Eisenstein, J., Hoffman, M. D., et al. Underspecification presents challenges for credibility in modern machine learning. arXiv preprint arXiv:2011.03395, 2020.
  17. de Sousa, I. P., Vellasco, M. M. B. R., and da Silva, E. C. Evolved explainable classifications for lymph node metastases. arXiv preprint arXiv:2005.07229, 2020.
  18. Deng, J., Ding, N., Jia, Y., Frome, A., Murphy, K., Bengio, S., Li, Y., Neven, H., and Adam, H. Large-scale object classification using label relation graphs. In European conference on computer vision, pp. 48-64. Springer, 2014.
  19. Dombrowski, A.-K., Alber, M., Anders, C., Ackermann, M., Müller, K.-R., and Kessel, P. Explanations can be manipulated and geometry is to blame. In Advances in Neural Information Processing Systems, pp. 13589-13600, 2019.
  20. Doshi-Velez, F. and Kim, B. Considerations for evaluation and generalization in interpretable machine learning. In Explainable and Interpretable Models in Computer Vision and Machine Learning, pp. 3-17. Springer, 2018.
  21. Esser, P., Rombach, R., and Ommer, B. A disentangling invertible interpretation network for explaining latent representations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9223-9232, 2020.
  22. Frye, C., de Mijolla, D., Cowton, L., Stanley, M., and Feige, I. Shapley-based explainability on the data manifold. arXiv preprint arXiv:2006.01272, 2020.
  23. Ghorbani, A., Abid, A., and Zou, J. Interpretation of neural networks is fragile. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pp. 3681-3688, 2019.
  24. Jain, S. and Wallace, B. C. Attention is not explanation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 3543-3556, 2019.
  25. Josephson, J. R. and Josephson, S. G. Abductive Inference: Computation, Philosophy, Technology. Cambridge University Press, 1996.
  26. Kang, M.-J. and Kang, J.-W. Intrusion detection system using deep neural network for in-vehicle network security. PLoS ONE, 11(6):e0155781, 2016.
  27. Kim, B., Wattenberg, M., Gilmer, J., Cai, C., Wexler, J., Viegas, F., et al. Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (TCAV). In International Conference on Machine Learning, pp. 2668-2677. PMLR, 2018.
  28. King, T. D. Human color perception, cognition, and culture: why red is always red. In Color Imaging X: Processing, Hardcopy, and Applications, volume 5667, pp. 234-242. International Society for Optics and Photonics, 2005.
  29. Lakkaraju, H., Kamar, E., Caruana, R., and Leskovec, J. Faithful and customizable explanations of black box models. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, pp. 131-138, 2019.
  30. Leavitt, M. L. and Morcos, A. Towards falsifiable interpretability research, 2020.
  31. Lewis, D. Causal explanation. Philosophical Papers, vol. 2, 1986.
  32. Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., and Zitnick, C. L. Microsoft COCO: Common objects in context. In European Conference on Computer Vision, pp. 740-755. Springer, 2014.
  33. Lipton, Z. C. The mythos of model interpretability. Queue, 2018.
  34. Lombrozo, T. The structure and function of explanations. Trends in Cognitive Sciences, 10(10):464-470, 2006.
  35. Lopez-Paz, D., Nishihara, R., Chintala, S., Schölkopf, B., and Bottou, L. Discovering causal signals in images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6979-6987, 2017.
  36. Lundberg, S. M. and Lee, S.-I. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems, pp. 4765-4774, 2017.
  37. Miller, T. Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence, 2019.
  38. Miller, T., Howe, P., and Sonenberg, L. Explainable AI: Beware of inmates running the asylum. In IJCAI 2017 Workshop on Explainable Artificial Intelligence (XAI), 2017. URL http://people.eng.unimelb.edu.au/tmiller/pubs/explanation-inmates.pdf.
  39. Montavon, G., Samek, W., and Müller, K.-R. Methods for interpreting and understanding deep neural networks. Digital Signal Processing, 2018.
  Council of the European Union and European Parliament. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation), 2016.
  40. Palacio, S., Folz, J., Hees, J., Raue, F., Borth, D., and Dengel, A. What do deep networks like to see? In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
  41. Ras, G., van Gerven, M., and Haselager, P. Explanation methods in deep learning: Users, values, concerns and challenges. In Explainable and Interpretable Models in Computer Vision and Machine Learning, pp. 19-36. Springer, 2018.
  42. Ribeiro, M. T., Singh, S., and Guestrin, C. "Why should I trust you?": Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135-1144, 2016.
  43. Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1(5):206-215, 2019.
  44. Rudin, C. and Ustun, B. Optimized scoring systems: Toward trust in machine learning for healthcare and criminal justice. Interfaces, 48(5):449-466, 2018.
  45. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., et al. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3):211-252, 2015.
  46. Schmid, U. and Finzel, B. Mutual explanations for cooperative decision making in medicine. KI-Künstliche Intelligenz, pp. 1-7, 2020.
  47. Schneeberger, D., Stöger, K., and Holzinger, A. The European legal framework for medical AI. In Holzinger, A., Kieseberg, P., Tjoa, A. M., and Weippl, E. (eds.), Machine Learning and Knowledge Extraction, pp. 209-226, Cham, 2020. Springer International Publishing. ISBN 978-3-030-57321-8.
  48. Slack, D., Hilgard, S., Jia, E., Singh, S., and Lakkaraju, H. Fooling LIME and SHAP: Adversarial attacks on post hoc explanation methods. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, pp. 180-186, 2020.
  49. Smeulders, A. W., Worring, M., Santini, S., Gupta, A., and Jain, R. Content-based image retrieval at the end of the early years. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(12):1349-1380, 2000.
  50. Sokol, K. and Flach, P. Explainability fact sheets: A framework for systematic assessment of explainable approaches. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, FAT* '20, pp. 56-67, New York, NY, USA, 2020. Association for Computing Machinery. ISBN 9781450369367. doi: 10.1145/3351095.3372870.
  51. Sowa, J. F. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley Pub., Reading, MA, 1983.
  52. Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., and Fergus, R. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013.
  53. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I. Attention is all you need. In Advances in Neural Information Processing Systems, pp. 5998-6008, 2017.
  54. Vilone, G. and Longo, L. Explainable artificial intelligence: a systematic review. arXiv preprint arXiv:2006.00093, 2020.
  55. Wiegreffe, S. and Pinter, Y. Attention is not not explanation. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 11-20, 2019.
  56. Xie, N., Ras, G., van Gerven, M., and Doran, D. Explainable deep learning: A field guide for the uninitiated. arXiv preprint arXiv:2004.14545, 2020.
  57. Zhang, Z., Xie, Y., Xing, F., McGough, M., and Yang, L. MDNet: A semantically and visually interpretable medical image diagnosis network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6428-6436, 2017.