Poisoning Attacks and Defenses on Artificial Intelligence: A Survey

Paper Title

Poisoning Attacks and Defenses on Artificial Intelligence: A Survey

Authors

Ramirez, Miguel A.; Kim, Song-Kyoo; Hamadi, Hussam Al; Damiani, Ernesto; Byon, Young-Ji; Kim, Tae-Yeon; Cho, Chung-Suk; Yeun, Chan Yeob

Abstract

Machine learning models have been widely adopted in several fields. However, recent studies have revealed several vulnerabilities to attacks that can jeopardize the integrity of the model, opening a new window of research opportunity in cyber-security. The main aim of this survey is to highlight the most relevant information on security vulnerabilities in machine learning (ML) classifiers, specifically the vulnerability of training procedures to data poisoning attacks. A data poisoning attack tampers with the data samples fed to the model during the training phase, which degrades the model's accuracy during the inference phase. This work compiles the most relevant insights and findings from the latest literature addressing this type of attack. It also covers several defense techniques that promise feasible detection and mitigation mechanisms, capable of conferring a certain level of robustness on a target model against an attacker. The reviewed works are assessed thoroughly, both quantitatively and qualitatively, by comparing the effects of data poisoning on a wide range of ML models in real-world conditions. The paper analyzes the main characteristics of each approach, including performance success metrics, required hyperparameters, and deployment complexity. It further emphasizes the underlying assumptions and limitations considered by both attackers and defenders, along with their intrinsic properties such as availability, reliability, privacy, accountability, and interpretability, among others. Finally, the paper concludes by referencing some of the main existing research trends that provide pathways towards future research directions in the field of cyber-security.
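
The poisoning mechanism summarized in the abstract (tampering with training samples so that accuracy degrades at inference time) can be illustrated with a minimal sketch. It is not taken from the survey: the nearest-centroid classifier, the one-dimensional toy data, the choice of label flipping as the attack, and all function names (`fit_centroids`, `predict`, `accuracy`, `flip_labels`) are assumptions made purely for illustration.

```python
from statistics import mean


def fit_centroids(xs, ys):
    """Train a toy nearest-centroid classifier on 1-D features.

    Returns a dict mapping each label to the mean of its training points.
    """
    return {
        label: mean(x for x, y in zip(xs, ys) if y == label)
        for label in set(ys)
    }


def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def accuracy(centroids, xs, ys):
    """Fraction of points whose predicted label matches the true label."""
    hits = sum(predict(centroids, x) == y for x, y in zip(xs, ys))
    return hits / len(ys)


def flip_labels(ys, indices):
    """Hypothetical attack: return a copy of ys with binary labels flipped
    at the given indices. The features themselves are left untouched."""
    poisoned = list(ys)
    for i in indices:
        poisoned[i] = 1 - poisoned[i]
    return poisoned


# Toy data (assumed): class 0 lies around 0, class 1 lies around 4.
train_x = [-1.0, -0.5, 0.0, 0.5, 1.0, 3.0, 3.5, 4.0, 4.5, 5.0]
train_y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
test_x = [-0.75, -0.25, 0.25, 0.75, 3.25, 3.75, 4.25, 4.75]
test_y = [0, 0, 0, 0, 1, 1, 1, 1]

# Clean training run.
clean_model = fit_centroids(train_x, train_y)

# Poisoned training run: the attacker flips the labels of the two most
# extreme points of each class (4 of 10 training labels).
poisoned_y = flip_labels(train_y, [0, 1, 8, 9])
poisoned_model = fit_centroids(train_x, poisoned_y)

clean_acc = accuracy(clean_model, test_x, test_y)
poisoned_acc = accuracy(poisoned_model, test_x, test_y)
print(f"clean accuracy:    {clean_acc:.2f}")
print(f"poisoned accuracy: {poisoned_acc:.2f}")
```

On this toy data the clean centroids are 0 and 4, so every test point is classified correctly. After the flips the class-0 centroid moves to 2.2 and the class-1 centroid to 1.8, which swaps their order and inverts every prediction on the test set. Real attacks and models are far less clear-cut; the sketch only shows why tampering confined to the training phase surfaces as accuracy loss at inference.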
