Paper Title
Won't you see my neighbor?: User predictions, mental models, and similarity-based explanations of AI classifiers
Paper Authors
Paper Abstract
Humans should be able to work more effectively with artificial intelligence-based systems when they can predict likely failures and form useful mental models of how the systems work. We conducted a study of humans' mental models of artificial intelligence systems using a high-performing image classifier, focusing on participants' ability to predict the classification result for a particular image. Participants viewed individual labeled images in one of two classes and then tried to predict whether the classifier would label them correctly. In this experiment we explored the effect of giving participants additional information about an image's nearest neighbors in a space representing the otherwise uninterpretable features extracted by the lower layers of the classifier's neural network. We found that providing this information did increase participants' prediction performance, and that the performance improvement could be related to the neighbor images' similarity to the target image. We also found indications that the presentation of this information may influence people's own classification of the target image; that is, rather than just anthropomorphizing the system, in some cases the humans become "mechanomorphized" in their judgements.
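To make the nearest-neighbor idea concrete, the sketch below shows one way such neighbors could be retrieved: each image is represented by a feature vector from an intermediate layer of the classifier, and the images closest to the target in that space are returned. This is a minimal illustration under assumed details (feature dimensionality, Euclidean distance, the placeholder feature matrix), not the authors' implementation.

```python
# Minimal sketch (not the authors' code): retrieving nearest-neighbor images in a
# feature space derived from an intermediate layer of an image classifier.
# The feature dimensionality, distance metric, and random placeholder features
# below are illustrative assumptions, not details reported in the abstract.
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for features extracted by the classifier's lower layers:
# one 512-dimensional vector per labeled image across the two classes.
n_images, feat_dim = 1000, 512
features = rng.normal(size=(n_images, feat_dim)).astype(np.float32)

def nearest_neighbors(target_idx: int, features: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k images closest to the target image in feature
    space (Euclidean distance), excluding the target itself."""
    dists = np.linalg.norm(features - features[target_idx], axis=1)
    order = np.argsort(dists)
    return order[order != target_idx][:k]

# Example: the neighbor images that might be shown alongside target image 42.
print(nearest_neighbors(target_idx=42, features=features, k=5))
```

In the study, such neighbor images were shown to participants alongside the target image as an explanation aid; here the distances are computed over placeholder vectors purely to demonstrate the retrieval step.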