Paper Title

Progressive Generation of Long Text with Pretrained Language Models

Authors

Bowen Tan, Zichao Yang, Maruan Al-Shedivat, Eric P. Xing, Zhiting Hu

Abstract

Large-scale language models (LMs) pretrained on massive corpora of text, such as GPT-2, are powerful open-domain text generators. However, as our systematic examination reveals, it is still challenging for such models to generate coherent long passages of text (e.g., 1000 tokens), especially when the models are fine-tuned to the target domain on a small corpus. Previous planning-then-generation methods also fall short of producing such long text in various domains. To overcome the limitations, we propose a simple but effective method of generating text in a progressive manner, inspired by generating images from low to high resolution. Our method first produces domain-specific content keywords and then progressively refines them into complete passages in multiple stages. The simple design allows our approach to take advantage of pretrained LMs at each stage and effectively adapt to any target domain given only a small set of examples. We conduct a comprehensive empirical study with a broad set of evaluation metrics, and show that our approach significantly improves upon the fine-tuned large LMs and various planning-then-generation methods in terms of quality and sample efficiency. Human evaluation also validates that our model generations are more coherent.
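The pipeline described in the abstract, producing coarse content keywords first and then refining them stage by stage with pretrained LMs, can be illustrated with a minimal sketch. The snippet below assumes the Hugging Face transformers library and GPT-2; the [SEP] separator, the refine helper, and the identical per-stage checkpoints are hypothetical placeholders for stage-specific fine-tuned models, not the authors' released implementation.

```python
# A minimal sketch of progressive coarse-to-fine generation, assuming
# Hugging Face transformers and GPT-2. The [SEP] separator and the three
# identical "gpt2" checkpoints are hypothetical placeholders; per the
# abstract, each stage would in practice be a pretrained LM fine-tuned
# to expand a coarser draft into a finer one on the target domain.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

SEP = "[SEP]"  # hypothetical marker separating the coarse draft from its refinement

def refine(model, tokenizer, coarse_draft, max_new_tokens=200):
    """Run one progressive stage: condition on the coarse draft, sample a finer one."""
    prompt = f"{coarse_draft} {SEP} "
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        do_sample=True,
        top_p=0.95,
        max_new_tokens=max_new_tokens,
        pad_token_id=tokenizer.eos_token_id,
    )
    text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return text.split(SEP, 1)[-1].strip()  # keep only the newly generated refinement

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
# One LM per stage; stage 1 would map a prompt to domain-specific keywords,
# later stages would progressively expand the draft into a full passage.
stage_models = [GPT2LMHeadModel.from_pretrained("gpt2") for _ in range(3)]

draft = "arctic ice climate scientists warming"  # stage-0 input: content keywords
for stage_model in stage_models:
    draft = refine(stage_model, tokenizer, draft)
print(draft)  # final, fully realized passage
```

Keeping every stage a plain left-to-right LM is what lets each one reuse a pretrained checkpoint, matching the abstract's point that the simple design allows pretrained LMs to be used at every stage; only the input-output format (draft, separator, refinement) changes between stages.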
