Paper Title
Pseudo-Bidirectional Decoding for Local Sequence Transduction
Paper Authors
Paper Abstract
Local sequence transduction (LST) tasks are sequence transduction tasks where there is massive overlap between the source and target sequences, such as Grammatical Error Correction (GEC) and spell or OCR correction. Previous work generally tackles LST tasks with standard sequence-to-sequence (seq2seq) models that generate output tokens from left to right and suffer from the issue of unbalanced outputs. Motivated by this characteristic of LST tasks, in this paper we propose a simple but versatile approach named Pseudo-Bidirectional Decoding (PBD) for LST tasks. PBD copies the corresponding representations of source tokens to the decoder as pseudo future context, enabling the decoder to attend to its bidirectional context. In addition, the bidirectional decoding scheme and the characteristic of LST tasks motivate us to share the encoder and the decoder of seq2seq models. The proposed PBD approach provides right-side context information for the decoder and models the inductive bias of LST tasks, reducing the number of parameters by half and providing a good regularization effect. Experimental results on several benchmark datasets show that our approach consistently improves the performance of standard seq2seq models on LST tasks.
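The copying idea in the abstract can be illustrated with a minimal sketch: at decoding step t, the decoder's attention context is formed from its own representations for already-generated positions, while encoder representations of the remaining source tokens stand in as pseudo future context. The function name, shapes, and the assumption that source and target lengths match (reasonable for LST, where sequences overlap heavily) are illustrative, not the paper's actual implementation.

```python
import numpy as np

def pbd_context(enc_reprs: np.ndarray, dec_reprs: np.ndarray, t: int) -> np.ndarray:
    """Build the decoder's attention context at step t (illustrative sketch).

    enc_reprs: (n, d) encoder representations of the n source tokens
    dec_reprs: (n, d) decoder representations of positions decoded so far
               (only rows 0..t are assumed valid at step t)
    Assumes source and target lengths are equal, which is a simplification
    motivated by the heavy source/target overlap in LST tasks.
    """
    past = dec_reprs[: t + 1]          # real left context: decoded positions 0..t
    pseudo_future = enc_reprs[t + 1:]  # copied source reps as pseudo right context
    return np.concatenate([past, pseudo_future], axis=0)
```

The decoder can then attend over this full-length context instead of only the left-to-right prefix, which is the "pseudo-bidirectional" effect the abstract describes.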