Paper Title
Adversarial Self-Attention for Language Understanding
Paper Authors
Abstract
Deep neural models (e.g. Transformer) naturally learn spurious features, which create a ``shortcut'' between the labels and inputs, impairing generalization and robustness. This paper advances the self-attention mechanism to a robust variant for Transformer-based pre-trained language models (e.g. BERT). We propose the \textit{Adversarial Self-Attention} mechanism (ASA), which adversarially biases the attention to effectively suppress the model's reliance on spurious features (e.g. specific keywords) and encourage its exploration of broader semantics. We conduct a comprehensive evaluation across a wide range of tasks for both the pre-training and fine-tuning stages. For pre-training, ASA yields remarkable performance gains over naive training, even when the latter runs for longer steps. For fine-tuning, ASA-empowered models outperform naive models by a large margin in both generalization and robustness.
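To make the idea concrete, here is a minimal NumPy sketch of adversarially biasing self-attention. In the paper ASA learns the adversarial mask via optimization against the task loss; the heuristic below (blocking each query's top-attended key so attention must spread to the remaining context) is purely illustrative, and all function names and shapes are assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(Q, K, V, mask=None):
    # Scaled dot-product self-attention.
    # mask: boolean array, True = position blocked (score set to -inf-like).
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    if mask is not None:
        scores = np.where(mask, -1e9, scores)
    return softmax(scores) @ V

def adversarial_mask(Q, K, top_k=1):
    # Illustrative "adversarial" bias (NOT the learned mask from the paper):
    # block the top_k keys each query attends to most, forcing the model
    # to rely on the broader remaining context.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    mask = np.zeros_like(scores, dtype=bool)
    idx = np.argsort(scores, axis=-1)[:, -top_k:]
    np.put_along_axis(mask, idx, True, axis=-1)
    return mask

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))  # 4 tokens, head dim 8
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))

out_naive = self_attention(Q, K, V)
out_asa = self_attention(Q, K, V, mask=adversarial_mask(Q, K))
```

In training, the mask would be chosen to maximize the task loss (the adversary) while the model parameters are updated to minimize it, which is what suppresses shortcut features.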