Paper Title
A Survey of Machine Unlearning
Paper Authors
Paper Abstract
Today, computer systems hold large amounts of personal data. Yet while such an abundance of data allows breakthroughs in artificial intelligence, and especially machine learning (ML), its existence can be a threat to user privacy, and it can weaken the bonds of trust between humans and AI. Recent regulations now require that, on request, private information about a user must be removed from both computer systems and ML models (i.e., ``the right to be forgotten''). While removing data from back-end databases should be straightforward, it is not sufficient in the AI context, as ML models often `remember' the old data. Contemporary adversarial attacks on trained models have demonstrated that we can learn whether an instance or an attribute belonged to the training data. This phenomenon calls for a new paradigm, namely machine unlearning, to make ML models forget about particular data. It turns out that recent works on machine unlearning have not been able to completely solve the problem due to the lack of common frameworks and resources. Therefore, this paper aspires to present a comprehensive examination of machine unlearning's concepts, scenarios, methods, and applications. Specifically, as a categorized collection of cutting-edge studies, this article is intended to serve as a comprehensive resource for researchers and practitioners seeking an introduction to machine unlearning and its formulations, design criteria, removal requests, algorithms, and applications. In addition, we aim to highlight the key findings, current trends, and new research areas that have not yet featured the use of machine unlearning but could benefit greatly from it. We hope this survey serves as a valuable resource for ML researchers and those seeking to innovate privacy technologies. Our resources are publicly available at https://github.com/tamlhp/awesome-machine-unlearning.
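To make the problem setup concrete, below is a minimal sketch of the naive ``exact unlearning'' baseline the abstract implicitly contrasts against: when a deletion request arrives, the model is retrained from scratch on the remaining data. This is not the survey's method, and the dataset, model choice, and forgotten index are purely illustrative assumptions; real unlearning algorithms aim to approximate this result without the full retraining cost.

```python
# Minimal sketch (assumptions: toy data, scikit-learn logistic regression).
# Exact unlearning baseline: retrain from scratch without the forgotten record.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy dataset standing in for user data (illustrative only).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Original model trained on all data, including the record to be forgotten.
model_full = LogisticRegression(max_iter=1000).fit(X, y)

# A deletion request arrives for one training record (index chosen arbitrarily).
forget_idx = 42
keep = np.ones(len(X), dtype=bool)
keep[forget_idx] = False

# Naive exact unlearning: retrain on the dataset minus the deleted record.
model_unlearned = LogisticRegression(max_iter=1000).fit(X[keep], y[keep])

# Crude membership-style check: the confidence assigned to the forgotten
# record often drops after retraining, which is the kind of signal that
# membership-inference attacks exploit on models that still "remember" data.
x_forget = X[forget_idx:forget_idx + 1]
p_before = model_full.predict_proba(x_forget)[0, y[forget_idx]]
p_after = model_unlearned.predict_proba(x_forget)[0, y[forget_idx]]
print(f"confidence on forgotten record: before={p_before:.3f}, after={p_after:.3f}")
```

The retraining baseline gives the gold-standard forgetting guarantee but scales poorly, which is why the survey catalogs approximate and efficient unlearning algorithms instead.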