Paper Title
InitialGAN: A Language GAN with Completely Random Initialization
Paper Authors
Paper Abstract
Text generative models trained via Maximum Likelihood Estimation (MLE) suffer from the notorious exposure bias problem, and Generative Adversarial Networks (GANs) have shown potential to tackle it. Existing language GANs adopt estimators such as REINFORCE or continuous relaxations to model word probabilities. The inherent limitations of such estimators lead current models to rely on pre-training techniques (MLE pre-training or pre-trained embeddings). Representation modeling methods, which are free from those limitations, are seldom explored because of their poor performance in previous attempts. Our analyses reveal that invalid sampling methods and unhealthy gradients are the main contributors to this unsatisfactory performance. In this work, we present two techniques to tackle these problems: dropout sampling and fully normalized LSTM. Based on these two techniques, we propose InitialGAN, whose parameters are randomly initialized in full. In addition, we introduce a new evaluation metric, Least Coverage Rate, to better evaluate the quality of generated samples. The experimental results demonstrate that InitialGAN outperforms both MLE and other compared models. To the best of our knowledge, this is the first time a language GAN has outperformed MLE without using any pre-training techniques.