Generative latent neural models for automatic word alignment

Paper title

Generative latent neural models for automatic word alignment

Authors

Ho, Anh Khoa Ngo; Yvon, François

Abstract

Word alignments identify translational correspondences between words in a parallel sentence pair and are used, for instance, to learn bilingual dictionaries, to train statistical machine translation systems, or to perform quality estimation. Variational autoencoders have recently been used in various areas of natural language processing to learn, in an unsupervised way, latent representations that are useful for language generation tasks. In this paper, we study these models for the task of word alignment, and we propose and assess several evolutions of a vanilla variational autoencoder. We demonstrate that these techniques can yield competitive results compared to Giza++ and to a strong neural network alignment system for two language pairs.
