Paper Title

Formalizing Trust in Artificial Intelligence: Prerequisites, Causes and Goals of Human Trust in AI

Paper Authors

Alon Jacovi, Ana Marasović, Tim Miller, Yoav Goldberg

Paper Abstract

Trust is a central component of the interaction between people and AI, in that 'incorrect' levels of trust may cause misuse, abuse or disuse of the technology. But what, precisely, is the nature of trust in AI? What are the prerequisites and goals of the cognitive mechanism of trust, and how can we promote them, or assess whether they are being satisfied in a given interaction? This work aims to answer these questions. We discuss a model of trust inspired by, but not identical to, sociology's interpersonal trust (i.e., trust between people). This model rests on two key properties: the vulnerability of the user, and the ability to anticipate the impact of the AI model's decisions. We incorporate a formalization of 'contractual trust', such that trust between a user and an AI is trust that some implicit or explicit contract will hold, and a formalization of 'trustworthiness' (which departs from the notion of trustworthiness in sociology), and with it concepts of 'warranted' and 'unwarranted' trust. We then present the possible causes of warranted trust as intrinsic reasoning and extrinsic behavior, and discuss how to design trustworthy AI, how to evaluate whether trust has manifested, and whether it is warranted. Finally, we elucidate the connection between trust and XAI using our formalization.
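
To make the abstract's central definitions concrete, the following is a minimal formal sketch in LaTeX notation. The predicates Trust, Trustworthy, Anticipates, and Vulnerable, and the counterfactual encoding of "warranted", are our illustrative shorthand under stated assumptions; the paper states these definitions in prose rather than in this notation.

% H: a human user, M: an AI model, C: an implicit or explicit contract.
% Contractual trust: H anticipates that M will uphold C, while H accepts
% vulnerability to the impact of M's decisions.
\mathrm{Trust}(H, M, C) \iff \mathrm{Anticipates}\big(H,\ M \text{ upholds } C\big) \land \mathrm{Vulnerable}(H, M)

% Trustworthiness: M is in fact capable of maintaining the contract C,
% independently of whether any user trusts it.
\mathrm{Trustworthy}(M, C) \iff M \text{ maintains } C

% Warranted trust: trust caused by trustworthiness. One standard encoding
% (our assumption) is a counterfactual conditional: if M were not
% trustworthy to C, the trust would not manifest.
\mathrm{Warranted}\big(\mathrm{Trust}(H, M, C)\big) \iff \big(\lnot\mathrm{Trustworthy}(M, C)\ \square\!\to\ \lnot\mathrm{Trust}(H, M, C)\big)

Under this reading, unwarranted trust is trust that would persist even if the model were not trustworthy, i.e., trust not caused by the model's actual ability to maintain the contract.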
