Paper title
Felix: Flexible Text Editing Through Tagging and Insertion
Paper authors
Paper abstract
We present Felix, a flexible text-editing approach for generation, designed to derive the maximum benefit from the ideas of decoding with bi-directional contexts and self-supervised pre-training. In contrast to conventional sequence-to-sequence (seq2seq) models, Felix is efficient in low-resource settings and fast at inference time, while being capable of modeling flexible input-output transformations. We achieve this by decomposing the text-editing task into two sub-tasks: tagging, which decides on the subset of input tokens and their order in the output text, and insertion, which in-fills the output tokens that are not present in the input. The tagging model employs a novel Pointer mechanism, while the insertion model is based on a Masked Language Model. Both of these models are chosen to be non-autoregressive to guarantee faster inference. Felix performs favourably when compared to recent text-editing methods and strong seq2seq baselines when evaluated on four NLG tasks: Sentence Fusion, Machine Translation Automatic Post-Editing, Summarization, and Text Simplification.
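The tagging-then-insertion decomposition described in the abstract can be illustrated with a small Python sketch. This is a toy under stated assumptions, not the authors' implementation: the function names `apply_tags` and `fill_masks`, the `keep` / `order` / `n_insert` encoding, and the example sentence are all hypothetical, and the learned tagging, pointer, and masked-language-model components are replaced by hand-written inputs.

```python
from typing import List

MASK = "[MASK]"  # placeholder slot that the insertion step fills in


def apply_tags(tokens: List[str], keep: List[bool],
               order: List[int], n_insert: List[int]) -> List[str]:
    """Tagging step (toy): select, reorder, and open insertion slots.

    keep[i]     -- True if tokens[i] survives into the output
    order       -- kept token indices in output order (stands in for the
                   pointer mechanism's prediction)
    n_insert[i] -- number of MASK slots to open right after tokens[i]
    """
    kept = {i for i, k in enumerate(keep) if k}
    if set(order) != kept or len(order) != len(kept):
        raise ValueError("order must list every kept token exactly once")
    out: List[str] = []
    for i in order:
        out.append(tokens[i])
        out.extend([MASK] * n_insert[i])
    return out


def fill_masks(sequence: List[str], predictions: List[str]) -> List[str]:
    """Insertion step (toy): fill all MASK slots in one pass.

    `predictions` stands in for a masked language model's outputs; every
    slot is filled at once rather than left to right, mirroring the
    non-autoregressive idea.
    """
    if sequence.count(MASK) != len(predictions):
        raise ValueError("need exactly one prediction per MASK slot")
    preds = iter(predictions)
    return [next(preds) if tok == MASK else tok for tok in sequence]


# Hypothetical example: drop "loud", swap "big" and "very", insert one word.
source = ["The", "big", "very", "loud", "cat"]
tagged = apply_tags(
    source,
    keep=[True, True, True, False, True],
    order=[0, 2, 1, 4],
    n_insert=[0, 1, 0, 0, 0],
)
edited = fill_masks(tagged, ["old"])
print(" ".join(tagged))
print(" ".join(edited))
```

In this sketch the first step only reuses and reorders source tokens, and new vocabulary enters solely through the `[MASK]` slots, which is the division of labour the abstract attributes to the two sub-tasks.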