Felix: Flexible Text Editing Through Tagging and Insertion

Paper Title

Felix: Flexible Text Editing Through Tagging and Insertion

Authors

Jonathan Mallinson, Aliaksei Severyn, Eric Malmi, Guillermo Garrido

Abstract

We present Felix --- a flexible text-editing approach for generation, designed to derive the maximum benefit from the ideas of decoding with bi-directional contexts and self-supervised pre-training. In contrast to conventional sequence-to-sequence (seq2seq) models, Felix is efficient in low-resource settings and fast at inference time, while being capable of modeling flexible input-output transformations. We achieve this by decomposing the text-editing task into two sub-tasks: tagging to decide on the subset of input tokens and their order in the output text and insertion to in-fill the missing tokens in the output not present in the input. The tagging model employs a novel Pointer mechanism, while the insertion model is based on a Masked Language Model. Both of these models are chosen to be non-autoregressive to guarantee faster inference. Felix performs favourably when compared to recent text-editing methods and strong seq2seq baselines when evaluated on four NLG tasks: Sentence Fusion, Machine Translation Automatic Post-Editing, Summarization, and Text Simplification.
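
The two-step decomposition described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`apply_tagging`, `fill_masks`, `toy_predict`), the dictionary encoding of the pointers, and the per-token mask counts are all assumptions made for illustration, and the lookup-style predictor only stands in for a real masked language model. The first step keeps a subset of the input tokens and follows pointers to fix their output order, leaving `[MASK]` placeholders where new tokens are needed; the second step fills every placeholder in one pass.

```python
from typing import Callable, Dict, List, Optional

MASK = "[MASK]"


def apply_tagging(
    tokens: List[str],
    start: int,
    next_ptr: Dict[int, int],
    masks_after: Dict[int, int],
) -> List[str]:
    """Step 1 (tagging): follow pointers over the kept tokens.

    Hypothetical encoding: `start` is the first kept token, `next_ptr` maps
    each kept token index to the index of the token that follows it in the
    output, and `masks_after` says how many placeholders to add after a token.
    Tokens that never appear in the pointer chain are treated as deleted.
    """
    output: List[str] = []
    seen = set()
    idx: Optional[int] = start
    while idx is not None:
        if idx in seen:
            raise ValueError("pointer chain contains a cycle")
        seen.add(idx)
        output.append(tokens[idx])
        output.extend([MASK] * masks_after.get(idx, 0))
        idx = next_ptr.get(idx)  # None ends the chain
    return output


def fill_masks(
    sequence: List[str],
    predict: Callable[[List[str], int], str],
) -> List[str]:
    """Step 2 (insertion): fill all placeholders from the same masked input.

    Every prediction sees the original masked sequence, so no fill depends
    on another fill; that is the non-autoregressive part of the sketch.
    """
    return [
        predict(sequence, i) if tok == MASK else tok
        for i, tok in enumerate(sequence)
    ]


def toy_predict(sequence: List[str], position: int) -> str:
    """Stand-in for a masked language model (assumption: always 'and')."""
    return "and"


# Sentence fusion example: drop ". Turing" and insert one connective.
source = "Turing was born in 1912 . Turing died in 1954 .".split()
pointers = {0: 1, 1: 2, 2: 3, 3: 4, 4: 7, 7: 8, 8: 9, 9: 10}
tagged = apply_tagging(source, start=0, next_ptr=pointers, masks_after={4: 1})
fused = fill_masks(tagged, toy_predict)
print(" ".join(tagged))
print(" ".join(fused))
```

Because the pointers define the output order independently of the input order, the same encoding can also express reordering: with tokens `["b", "a"]`, a start index of 1 and a pointer from 1 to 0, the tagging step yields `["a", "b"]`.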
