Paper Title
Can counterfactual explanations of AI systems' predictions skew lay users' causal intuitions about the world? If so, can we correct for that?
Paper Authors
Paper Abstract
Counterfactual (CF) explanations have been employed as one of the modes of explainability in explainable AI, both to increase the transparency of AI systems and to provide recourse. Cognitive science and psychology, however, have pointed out that people regularly use CFs to express causal relationships. Most AI systems are only able to capture associations or correlations in data, so interpreting them as causal would not be justified. In this paper, we present two experiments (total N = 364) exploring the effects of CF explanations of an AI system's predictions on lay people's causal beliefs about the real world. In Experiment 1, we found that providing CF explanations of an AI system's predictions does indeed (unjustifiably) affect people's causal beliefs regarding the factors/features the AI uses, and that people are more likely to view those factors/features as causal in the real world. Inspired by the literature on misinformation and health warning messaging, Experiment 2 tested whether we can correct for this unjustified change in causal beliefs. We found that pointing out that AI systems capture correlations, and not necessarily causal relationships, can attenuate the effects of CF explanations on people's causal beliefs.