Paper Title

A Survey of Adversarial Defences and Robustness in NLP

Paper Authors

Shreya Goyal, Sumanth Doddapaneni, Mitesh M. Khapra, Balaraman Ravindran

Paper Abstract

In the past few years, it has become increasingly evident that deep neural networks are not resilient enough to withstand adversarial perturbations in input data, leaving them vulnerable to attack. Various authors have proposed strong adversarial attacks for computer vision and Natural Language Processing (NLP) tasks. As a response, many defense mechanisms have also been proposed to prevent these networks from failing. The significance of defending neural networks against adversarial attacks lies in ensuring that the model's predictions remain unchanged even if the input data is perturbed. Several methods for adversarial defense in NLP have been proposed, catering to different NLP tasks such as text classification, named entity recognition, and natural language inference. Some of these methods not only defend neural networks against adversarial attacks but also act as a regularization mechanism during training, saving the model from overfitting. This survey aims to review the various methods proposed for adversarial defenses in NLP over the past few years by introducing a novel taxonomy. The survey also highlights the fragility of advanced deep neural networks in NLP and the challenges involved in defending them.
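The abstract notes that some NLP defenses also act as a regularization mechanism during training. As a purely illustrative sketch (not the survey's taxonomy or any specific surveyed method), the snippet below shows one widely used form of such a defense: FGSM-style adversarial training in the word-embedding space, where a small loss-increasing perturbation of the embeddings is added to the training objective. The model, data, and all names and hyperparameters (TinyTextClassifier, EPSILON, the toy batch) are assumptions made only for this example.

```python
# Minimal sketch of embedding-space adversarial training for text classification.
# Everything here (model, data, hyperparameters) is a toy placeholder.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

VOCAB, EMB, CLASSES, SEQ_LEN, EPSILON = 1000, 32, 2, 16, 0.1


class TinyTextClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.fc = nn.Linear(EMB, CLASSES)

    def forward_from_embeddings(self, e):
        # Mean-pool token embeddings, then classify.
        return self.fc(e.mean(dim=1))

    def forward(self, token_ids):
        return self.forward_from_embeddings(self.emb(token_ids))


model = TinyTextClassifier()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Toy batch: random token ids and labels stand in for a real dataset.
tokens = torch.randint(0, VOCAB, (8, SEQ_LEN))
labels = torch.randint(0, CLASSES, (8,))

for step in range(100):
    opt.zero_grad()
    embeds = model.emb(tokens)
    clean_loss = F.cross_entropy(model.forward_from_embeddings(embeds), labels)

    # FGSM-style perturbation: move each embedding a small step in the
    # direction that increases the loss (gradient treated as a constant).
    grad = torch.autograd.grad(clean_loss, embeds, retain_graph=True)[0]
    perturbed = embeds + EPSILON * grad.sign()
    adv_loss = F.cross_entropy(model.forward_from_embeddings(perturbed), labels)

    # Training on the clean + adversarial loss is the defense; the extra
    # adversarial term also acts as a regularizer.
    (clean_loss + adv_loss).backward()
    opt.step()
    if step % 20 == 0:
        print(f"step {step}: clean={clean_loss.item():.3f} adv={adv_loss.item():.3f}")
```

Character- or word-substitution attacks and their defenses, which the survey also covers, operate on discrete tokens rather than embeddings; the embedding-space variant is shown here only because it is compact and self-contained.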
