Paper Title

Improving robustness of language models from a geometry-aware perspective

Paper Authors

Zhu, Bin; Gu, Zhaoquan; Wang, Le; Chen, Jinyin; Xuan, Qi

Paper Abstract

Recent studies have found that removing the norm-bounded projection and increasing search steps in adversarial training can significantly improve robustness. However, we observe that too many search steps can hurt accuracy. We aim to obtain strong robustness efficiently using fewer steps. Through a toy experiment, we find that perturbing the clean data to the decision boundary but not crossing it does not degrade the test accuracy. Inspired by this, we propose friendly adversarial data augmentation (FADA) to generate friendly adversarial data. On top of FADA, we propose geometry-aware adversarial training (GAT) to perform adversarial training on friendly adversarial data so that we can save a large number of search steps. Comprehensive experiments across two widely used datasets and three pre-trained language models demonstrate that GAT can obtain stronger robustness via fewer steps. In addition, we provide extensive empirical results and in-depth analyses on robustness to facilitate future studies.
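
To make the notion of "friendly adversarial data" concrete, below is a minimal PyTorch-style sketch, not the authors' released implementation: it ascends the loss in embedding space without a norm-bounded projection, but keeps only the last perturbation that does not flip the model's prediction, i.e., it stops at (rather than crosses) the decision boundary. The function name friendly_adversarial_perturbation, the assumption that model maps embeddings to logits, and the hyperparameters step_size and max_steps are illustrative assumptions.

import torch
import torch.nn.functional as F

def friendly_adversarial_perturbation(model, embeddings, labels,
                                      step_size=1e-2, max_steps=5):
    # Illustrative sketch (assumed interface): `model` maps embeddings -> logits.
    # Push clean embeddings toward the decision boundary with gradient-ascent
    # steps, but return only the last perturbation that does NOT flip the
    # model's prediction ("friendly" adversarial data).
    delta = torch.zeros_like(embeddings, requires_grad=True)
    friendly_delta = torch.zeros_like(embeddings)  # last prediction-preserving delta

    for _ in range(max_steps):
        loss = F.cross_entropy(model(embeddings + delta), labels)
        grad, = torch.autograd.grad(loss, delta)

        with torch.no_grad():
            # Unbounded ascent step (no norm-bounded projection).
            delta = delta + step_size * grad.sign()
            preds = model(embeddings + delta).argmax(dim=-1)
            if (preds != labels).any():
                # The boundary would be crossed for some example in the batch;
                # keep the previous, still-correctly-classified perturbation.
                break
            friendly_delta = delta.clone()
        delta.requires_grad_(True)

    return embeddings + friendly_delta

Under this sketch, adversarial training would then start from embeddings + friendly_delta rather than from the clean embeddings, which is the intuition behind how friendly adversarial data lets training reach strong robustness with far fewer search steps per batch.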
