Paper Title
Additive Interventions Yield Robust Multi-Domain Machine Translation Models
Paper Authors
Paper Abstract
Additive interventions are a recently proposed mechanism for controlling target-side attributes in neural machine translation. In contrast to tag-based approaches, which manipulate the raw source sequence, interventions work by directly modulating the encoder representations of all tokens in the sequence. We examine the role of additive interventions in a large-scale multi-domain machine translation setting and compare their performance under various inference scenarios. We find that while the performance difference between intervention-based and tag-based systems is small when the domain label matches the test domain, intervention-based systems are robust to label error, making them an attractive choice under label uncertainty. Further, we find that the superiority of single-domain fine-tuning is called into question when training data size is scaled, contradicting previous findings.
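To illustrate the contrast the abstract draws: a tag-based system prepends a domain token to the raw source sequence, while an additive intervention adds a learned domain vector to every token's encoder representation. A minimal sketch of the two control points, using numpy in place of a real NMT encoder (the function names, toy dimensions, and `<medical>` tag are hypothetical, not from the paper):

```python
import numpy as np

def tag_based_control(source_tokens, domain_tag):
    # Tag-based approach: manipulate the raw source sequence by
    # prepending a special domain token before encoding.
    return [domain_tag] + source_tokens

def additive_intervention(encoder_states, domain_vector):
    # Additive intervention: add the same learned domain vector to the
    # encoder representation of every token (broadcast over positions),
    # leaving the source sequence itself untouched.
    return encoder_states + domain_vector[None, :]

# Toy example: 2 source tokens, hidden size 4.
tagged = tag_based_control(["hello", "world"], "<medical>")
states = np.zeros((2, 4))           # stand-in for encoder outputs
domain_vec = np.ones(4)             # stand-in for a learned domain embedding
shifted = additive_intervention(states, domain_vec)

print(tagged)         # ['<medical>', 'hello', 'world']
print(shifted.shape)  # (2, 4)
```

Note how the intervention changes only the continuous representations, so a wrong domain vector perturbs the encoding rather than injecting a spurious token, which is one intuition for the robustness to label error reported above.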