Computing contrastive, counterfactual explanations for Bayesian networks
Summary
In recent years, researchers have explored how counterfactual statements can be used to explain the value of a specific target variable in various Artificial Intelligence systems. How counterfactual statements can be used to explain a Bayesian network (BN), however, is a relatively unexplored topic. Because people generally prefer explanations in which one value is contrasted against another, we aim to explain a given target variable in a BN with an explanation that is both contrastive and counterfactual. After defining a contrastive, counterfactual explanation and presenting a naive approach for computing all explanations, we first constructed an algorithm that finds all explanations more efficiently in an enhanced subset lattice for binary-valued evidence variables. Secondly, we explored how a monotonicity relation between the evidence and the target can be exploited to compute all explanations more efficiently for evidence that is not binary valued. We derived several propositions about the inclusion of evidence variables in an explanation, based on the relative ordering of the observed value of the evidence variable and the most probable and expected values of the target. With these propositions we constructed an algorithm that finds all explanations by a breadth-first search through the monotonicity-enhanced subset lattice. We concluded our research by providing two methods for selecting the explanations that are most useful to a user, and by giving templates for presenting the explanations to the user in textual form.
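To give a flavour of the naive approach mentioned above, the sketch below enumerates every subset of the evidence variables and every alternative value assignment for that subset, keeping those changes that alter the target's most probable value. This is only an illustrative sketch under assumptions: the names `naive_explanations`, `most_probable_value`, and `domains` are hypothetical, `most_probable_value` stands for a BN inference routine to be supplied by the reader, and the working definition of an explanation (any evidence change that flips the target's most probable value) is a simplification rather than the exact definition developed in the thesis.

```python
from itertools import combinations, product

def naive_explanations(target, observed_evidence, domains, most_probable_value):
    """Enumerate all subsets of the evidence and all alternative value
    assignments; keep those that change the target's most probable value.

    target            -- name of the target variable
    observed_evidence -- dict mapping evidence variable -> observed value
    domains           -- dict mapping evidence variable -> list of possible values
    most_probable_value(target, evidence) -- BN inference oracle (user-supplied)
    """
    original = most_probable_value(target, observed_evidence)
    explanations = []
    variables = list(observed_evidence)
    for size in range(1, len(variables) + 1):
        for subset in combinations(variables, size):
            # For each selected variable, consider every value except the observed one.
            alternatives = [
                [value for value in domains[var] if value != observed_evidence[var]]
                for var in subset
            ]
            for assignment in product(*alternatives):
                counterfactual = dict(observed_evidence)
                counterfactual.update(dict(zip(subset, assignment)))
                # A change that flips the target's most probable value is a
                # candidate contrastive, counterfactual explanation.
                if most_probable_value(target, counterfactual) != original:
                    explanations.append(dict(zip(subset, assignment)))
    return explanations
```

Such an enumeration is exponential in the number of evidence variables and their domain sizes, which is what motivates the lattice-based and monotonicity-based algorithms summarised above.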