Paper Title
Generating Authentic Adversarial Examples beyond Meaning-preserving with Doubly Round-trip Translation
Authors
Abstract
Generating adversarial examples for Neural Machine Translation (NMT) with single Round-Trip Translation (RTT) has achieved promising results by relaxing the meaning-preserving restriction. However, a potential pitfall of this approach is that we cannot decide whether the generated examples are adversarial to the target NMT model or to the auxiliary backward model, as the reconstruction error through the RTT can be related to either. To remedy this problem, we propose a new criterion for NMT adversarial examples based on Doubly Round-Trip Translation (DRTT). Specifically, apart from the source-target-source RTT, we also consider the target-source-target one, which is used to pick out the authentic adversarial examples for the target NMT model. Additionally, to enhance the robustness of the NMT model, we introduce masked language models to construct bilingual adversarial pairs based on DRTT, which are used to train the NMT model directly. Extensive experiments on both clean and noisy test sets (including artificial and natural noise) show that our approach substantially improves the robustness of NMT models.
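The two round trips described in the abstract can be illustrated with a minimal sketch. The abstract does not state the similarity metric, the exact form of the criterion, or any threshold values, so everything concrete here is an assumption made for illustration: the token-level F1 similarity, the thresholds `beta` and `gamma`, the function names, and the word-by-word toy "translators" that stand in for the forward (target) and backward (auxiliary) NMT models. The idea shown is only the one the abstract gives: a perturbed source is kept when its source-target-source reconstruction degrades while the target-source-target reconstruction of its translation stays stable, which suggests the error comes from the forward model rather than the backward one.

```python
from collections import Counter
from typing import Callable, Dict

Translator = Callable[[str], str]


def token_f1(a: str, b: str) -> float:
    """Toy similarity (assumption): token-level F1 over bags of words."""
    ca, cb = Counter(a.split()), Counter(b.split())
    overlap = sum((ca & cb).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cb.values())
    recall = overlap / sum(ca.values())
    return 2 * precision * recall / (precision + recall)


def round_trip_error(text: str, first: Translator, second: Translator) -> float:
    """Reconstruction error of `text` after translating with `first`, then `second`."""
    return 1.0 - token_f1(text, second(first(text)))


def is_authentic_adversarial(
    x: str,
    x_adv: str,
    forward: Translator,
    backward: Translator,
    beta: float = 0.2,   # hypothetical threshold on the source-side degradation
    gamma: float = 0.2,  # hypothetical threshold on the target-side error
) -> bool:
    """Sketch of a doubly round-trip check for a perturbed source `x_adv`."""
    # Source-target-source: how much worse is the reconstruction after perturbing?
    src_increase = (
        round_trip_error(x_adv, forward, backward)
        - round_trip_error(x, forward, backward)
    )
    # Target-source-target: start from the forward model's output on x_adv.
    y_adv = forward(x_adv)
    tgt_error = round_trip_error(y_adv, backward, forward)
    # Large source-side degradation with a stable target-side round trip
    # points at the forward model, not the auxiliary backward model.
    return src_increase > beta and tgt_error < gamma


def word_by_word(table: Dict[str, str]) -> Translator:
    """Toy translator (assumption): dictionary lookup, unknown words pass through."""
    return lambda text: " ".join(table.get(tok, tok) for tok in text.split())


# Toy models: the forward model mistranslates "banc"; the backward model
# mistranslates "Ufer". Both tables are invented for this example.
forward = word_by_word({"the": "die", "bank": "Bank", "is": "ist", "open": "offen",
                        "banc": "Hund", "dog": "Hund", "shore": "Ufer"})
backward = word_by_word({"die": "the", "Bank": "bank", "ist": "is", "offen": "open",
                         "Hund": "dog", "Ufer": "bank"})
```

With these toy tables, `is_authentic_adversarial("the bank is open", "the banc is open", forward, backward)` accepts the perturbation, because the forward model causes the failure. The perturbation "the shore is open" is rejected, because its reconstruction error comes from the backward model. A single round trip would flag both cases.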