Paper Title
Selective Token Generation for Few-shot Natural Language Generation
Paper Authors
Paper Abstract
Natural language modeling with limited training data is a challenging problem, and many algorithms address it with large-scale pretrained language models (PLMs) because of their strong generalization ability. Among them, additive learning, which places a task-specific adapter on top of a fixed large-scale PLM, has been widely used in the few-shot setting. However, such an added adapter can still easily disregard the knowledge of the PLM, especially in few-shot natural language generation (NLG), since the entire sequence is usually generated by the newly trained adapter alone. Therefore, in this work, we develop a novel additive learning algorithm based on reinforcement learning (RL) that selectively outputs language tokens from either the task-general PLM or the task-specific adapter during both training and inference. This token-level selection over the two generators allows the adapter to focus solely on the task-relevant parts of sequence generation, which makes it more robust to overfitting and more stable in RL training. In addition, to obtain an adapter that is complementary to the PLM for each few-shot task, we exploit a separate selecting module that is trained simultaneously using RL. Experimental results on various few-shot NLG tasks, including question answering, data-to-text generation, and text summarization, demonstrate that the proposed selective token generation significantly outperforms previous additive learning algorithms based on PLMs.
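To make the described mechanism concrete, below is a minimal, illustrative PyTorch sketch of token-level selection between a frozen task-general generator and a task-specific one, driven by a separate selecting module. All module names, sizes, and the greedy decoding loop are assumptions for illustration, not the authors' implementation; in the actual method the selector and adapter would be trained with an RL reward on the generated sequences.

```python
# Minimal sketch (hypothetical names/sizes): a binary selector decides, at each
# decoding step, whether the next token comes from the frozen task-general PLM
# or from the task-specific adapter branch.
import torch
import torch.nn as nn

VOCAB, HIDDEN = 100, 32

class TinyLM(nn.Module):
    """Stand-in for a language model: embeds the prefix and predicts next-token logits."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.rnn = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.head = nn.Linear(HIDDEN, VOCAB)

    def forward(self, prefix):                  # prefix: (batch, seq)
        h, _ = self.rnn(self.embed(prefix))
        return self.head(h[:, -1]), h[:, -1]    # next-token logits, last hidden state

plm = TinyLM()                                  # task-general generator (kept frozen)
adapter = TinyLM()                              # task-specific generator (trained)
selector = nn.Linear(2 * HIDDEN, 2)             # selecting module: PLM vs. adapter

for p in plm.parameters():
    p.requires_grad_(False)

def generate(prefix, steps=5):
    """Greedy decoding with per-token source selection, as the abstract describes at a high level."""
    tokens, choices = prefix, []
    for _ in range(steps):
        plm_logits, h_plm = plm(tokens)
        ada_logits, h_ada = adapter(tokens)
        # The selector samples which generator emits this token (the sampled
        # choices would be reinforced by a task reward during RL training).
        choice = torch.distributions.Categorical(
            logits=selector(torch.cat([h_plm, h_ada], dim=-1))).sample()
        logits = torch.where(choice.unsqueeze(-1).bool(), ada_logits, plm_logits)
        next_tok = logits.argmax(dim=-1, keepdim=True)
        tokens = torch.cat([tokens, next_tok], dim=1)
        choices.append(choice)
    return tokens, choices

out, sel = generate(torch.randint(0, VOCAB, (1, 4)))
print(out.shape, [c.item() for c in sel])       # generated tokens and per-step source choices
```

Under this reading, only the adapter and selector receive gradients, so tokens routed to the frozen PLM keep its general knowledge intact while the adapter handles the task-specific parts of the sequence.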