Paper Title
TextGAIL: Generative Adversarial Imitation Learning for Text Generation
Paper Authors
Abstract
Generative Adversarial Networks (GANs) for text generation have recently drawn considerable criticism because they perform worse than their MLE counterparts. We suspect that the inferior performance of previous text GANs is due to the lack of a reliable guiding signal in their discriminators. To address this problem, we propose a generative adversarial imitation learning framework for text generation that uses large pre-trained language models to provide more reliable reward guidance. Our approach uses a contrastive discriminator and proximal policy optimization (PPO) to stabilize and improve text generation performance. For evaluation, we conduct experiments on a diverse set of unconditional and conditional text generation tasks. Experimental results show that TextGAIL achieves better performance than the MLE baseline in terms of both quality and diversity. With an additional task, we also validate our intuition that TextGAIL's discriminator is capable of providing reasonable rewards.
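The abstract names two components, a contrastive discriminator and PPO, without giving formulas. The following is a minimal sketch of how such pieces are commonly written, not the authors' implementation. It assumes the contrastive reward is a two-way softmax over a discriminator score for a generated sample and a score for a real sample, and it uses the standard clipped surrogate objective from PPO. The function names, the scalar scores, and the default `clip_eps` value are all hypothetical.

```python
import math


def contrastive_reward(score_generated: float, score_real: float) -> float:
    """Assumed reward: softmax probability assigned to the generated sample
    when it is compared against a real sample (not taken from the abstract)."""
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(score_generated, score_real)
    e_gen = math.exp(score_generated - m)
    e_real = math.exp(score_real - m)
    return e_gen / (e_gen + e_real)


def ppo_clipped_objective(
    logp_new: float, logp_old: float, advantage: float, clip_eps: float = 0.2
) -> float:
    """Standard PPO clipped surrogate for a single action (to be maximized)."""
    # Probability ratio between the updated policy and the sampling policy.
    ratio = math.exp(logp_new - logp_old)
    # Restrict the ratio to the interval [1 - clip_eps, 1 + clip_eps].
    clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
    # Take the pessimistic (smaller) of the unclipped and clipped terms.
    return min(ratio * advantage, clipped * advantage)
```

In this sketch, a generated sample that the discriminator scores the same as a real one receives a reward of 0.5, and the clipping limits how far a single update can move the generator away from the policy that produced the samples.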