
Model Transparency

17 papers
0 followers

About this topic
Model transparency refers to the degree to which the internal workings and decision-making processes of a model, particularly in machine learning and artificial intelligence, are understandable and interpretable by humans. It emphasizes clarity in how models operate, enabling stakeholders to assess their reliability, fairness, and accountability.

Key research themes

1. How can standardized frameworks define and assess transparency levels for diverse stakeholders in autonomous and AI systems?

This research area focuses on developing measurable, testable standards to specify and assess transparency in autonomous systems, addressing the varying needs of different stakeholders such as users, regulators, and investigators. Establishing such frameworks matters to ensure accountability, trust, and safety by making AI systems understandable and their decisions explicable across multiple application contexts.

Key finding: Introduces the IEEE P7001 draft standard as a structured approach defining testable transparency levels tailored to five stakeholder groups (users, the public/bystanders, safety agencies, investigators, and lawyers). The standard...
Key finding: Develops the concept of Transparency by Design (TbD), integrating contextual, technical, informational, and stakeholder-sensitive principles into AI system development. Proposes nine principles inspired by privacy-by-design...
Key finding: Proposes Method Cards as prescriptive documentation artifacts that go beyond descriptive transparency by providing actionable guidance to ML engineers on model reproduction, design rationales, and mitigation strategies for...

2. What are the epistemic and practical challenges underlying transparency in complex AI-driven simulations and computational systems?

This research theme investigates the nature of opacity and transparency in complex computational systems such as AI, computer simulations, and big data applications. It explores the conceptual limits of knowledge and understanding about system internals, addresses the multiple layers of opacity, and evaluates how partial or instrumental transparency can be attained to support scientific explanations, artifact detection, and trustworthy deployment.

Key finding: Reconceptualizes opacity beyond Humphreys’ notion of computational steps inaccessible by hand, defining opacity as the disposition to resist epistemic access, including forms of knowledge and understanding. It distinguishes different...
Key finding: Analyzes transparency as consisting of three forms: functional transparency (algorithmic functioning), structural transparency (implementation in code), and run transparency (actual execution on hardware and data), to address...

3. How do methodological and user-centered approaches advance transparency in interpretability, data documentation, and explanation of AI models?

This area studies practical frameworks and methodologies to improve transparency through structured documentation, interpretability techniques, and user-centric explanations. It focuses on additive versus non-additive model explanations, transparent documentation of datasets and processes, and enhanced user communication to bridge the gap between technical AI design and interpretability by diverse stakeholders.

Key finding: Compares multiple additive explanation methods (partial dependence, Shapley explanations, distilled additive explanations, gradient-based explanations) for black-box models, revealing that distilled additive explanations... (one such method is sketched after this list)
Key finding: Explores transparency as a situated and evolving process in qualitative research methodology, emphasizing 'methodological data' as reflexive artifacts that complicate simplistic accounts of transparency. Argues that...
Key finding: Proposes Method Cards as a novel documentation tool for machine learning that combines descriptive information with prescriptive guidance. These cards facilitate model reproduction, clarify design choices, and provide...
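
As a point of reference for the first key finding above, here is a minimal sketch of one of the named additive methods, partial dependence. It assumes only scikit-learn's public API; the model and data are illustrative and are not drawn from any listed paper.

from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import partial_dependence

# Fit an opaque ensemble model on synthetic data (illustrative only).
X, y = make_regression(n_samples=500, n_features=5, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Partial dependence of the prediction on feature 0: the model's output
# averaged over the data while that one feature is swept across a grid,
# yielding an additive per-feature effect curve.
pd_result = partial_dependence(model, X, features=[0], grid_resolution=20)
print(pd_result["average"].shape)  # (1, 20): one curve, 20 grid points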

All papers in Model Transparency

The interpretability of deep neural networks (DNNs) is a critical focus in artificial intelligence (AI) and machine learning (ML), particularly as these models are increasingly deployed in high-stakes applications such as healthcare,...
In the natural and social sciences, it is common to use toy models (extremely simple and highly idealized representations) to understand complex phenomena. Some of the simple surrogate models used to understand opaque machine learning (ML)...
Motivated by a recent trend that advocates a reassessment of the aim of medical science and clinical practice, this paper investigates the epistemic aims of biomedical research. Drawing on contemporary discussions in epistemology and the...
Consistency, scalability, and local stability properties ensure that a model or method produces reliable and predictable outcomes. Shapash helps users understand how a model makes its decisions. With machine learning (ML) systems,...
This study investigates the Shapley additive explanation (SHAP) of the extreme gradient boosting (XGBoost) model for breast cancer diagnosis. The study employed Wisconsin’s breast cancer dataset, characterized by 30 features extracted from an...
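The workflow this abstract describes maps onto widely used open-source tooling. A minimal sketch, assuming the scikit-learn copy of the Wisconsin dataset and the public xgboost and shap APIs rather than the study's exact pipeline:

import shap
import xgboost
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Wisconsin breast cancer data: 569 samples, 30 features per sample.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Train an XGBoost classifier of the kind the abstract refers to.
model = xgboost.XGBClassifier(random_state=0).fit(X_train, y_train)

# TreeExplainer computes exact Shapley values for tree ensembles; each
# value is one feature's additive contribution to one prediction's log-odds.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
print(shap_values.shape)  # (n_test_samples, 30)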
In a recent paper, Erasmus et al. (2021) defend the idea that the ambiguity of the term "explanation" in explainable AI (XAI) can be solved by adopting any of four different extant accounts of explanation in the philosophy of science: the...
This is an integrative review that addresses the question, "What makes for a good explanation?" with reference to AI systems. Pertinent literatures are vast. Thus, this review is necessarily selective. That said, most of the key...
Previous research in Explainable Artificial Intelligence (XAI) suggests that a main aim of explainability approaches is to satisfy specific interests, goals, expectations, needs, and demands regarding artificial systems (we call these...
Power system phasor time-domain simulation is often carried out through domain-specific tools such as Eurostag, PSS/E, and others. While these tools are efficient, their individual sub-component models and solvers cannot be accessed by...
In order for a machine learning model to be useful, it must be used. Opaque models that predict or classify without explaining are often ignored. Thus, measuring the satisfaction of those who receive an explanation is one natural way to...
In the paper, I review some of the emerging philosophical literature on the problem of using artificial neural networks (ANNs) and deep learning in science. Specifically, I focus on the problem of opacity in such systems and argue that...
The field of Artificial Intelligence has seen dramatic progress over the last 15 years. Using machine learning methods (software systems that automatically learn and improve relationships using digitized experience), researchers and...
Supervised machine-learning models boast remarkable predictive capabilities. But can you trust your model? Will it work in deployment? What else can it tell you about the world?
Recent work on interpretability in machine learning and AI has focused on the building of simplified models that approximate the true criteria used to make decisions. These models are a useful pedagogical device for teaching trained...
Explainability and comprehensibility of AI are important requirements for intelligent systems deployed in real-world domains. Users want and frequently need to understand how decisions impacting them are made. Similarly, it is important to...
Previous research in Explainable Artificial Intelligence (XAI) suggests that a main aim of explainability approaches is to satisfy specific interests, goals, expectations, needs, and demands regarding artificial systems (we call these... more
The rapid growth of research in explainable artificial intelligence (XAI) follows on two substantial developments. First, the enormous application success of modern machine learning methods, especially deep and reinforcement learning,...
Many high-performance models suffer from a lack of interpretability. There has been an increasing influx of work on explainable artificial intelligence (XAI) in order to disentangle what is meant and expected by XAI. Nevertheless, there...
The field of explainable AI (XAI) has quickly become a thriving and prolific community. However, a silent, recurrent and acknowledged issue in this area is the lack of consensus regarding its terminology. In particular, each new...
Explainability is assumed to be a key factor for the adoption of Artificial Intelligence systems in a wide range of contexts (Hoffman, Mueller, & Klein, 2017; Hoffman, Mueller, Klein, & Litman, 2018; Doran, Schulz, & Besold, 2017; Lipton,...
From CHI '21 (May 8-13, 2021, online virtual conference): Given that there are a variety of stakeholders involved in, and...
The widespread adoption of black-box models in Artificial Intelligence has heightened the need for explanation methods to reveal how these obscure models reach specific decisions. Retrieving explanations is fundamental to unveiling possible...
by Giulia Vilone and 1 more
Explainable Artificial Intelligence (XAI) has experienced significant growth over the last few years. This is due to the widespread application of machine learning, particularly deep learning, which has led to the development of highly...
The technologies supporting Artificial Intelligence (AI) have advanced rapidly over the past few years, and AI is becoming commonplace in every aspect of life, from the future of self-driving cars to earlier health diagnosis. For this to...
We argue that artificial networks are explainable and offer a novel theory of interpretability. Two sets of conceptual questions are prominent in theoretical engagements with artificial neural networks, especially in the context of...
In this paper I argue that the search for explainable models and interpretable decisions in AI must be reformulated in terms of the broader project of offering a pragmatic and naturalistic account of understanding in AI. Intuitively, the...