Paper Title
F^2-Softmax: Diversifying Neural Text Generation via Frequency Factorized Softmax
Paper Authors
Paper Abstract
Despite recent advances in neural text generation, encoding the rich diversity of human language remains elusive. We argue that sub-optimal text generation is mainly attributable to the imbalanced token distribution, which particularly misdirects the learning model when trained with the maximum-likelihood objective. As a simple yet effective remedy, we propose two novel methods, F^2-Softmax and MefMax, which enable balanced training even under a skewed frequency distribution. MefMax assigns each token uniquely to a frequency class, grouping tokens with similar frequencies while equalizing the total frequency mass across classes. F^2-Softmax then decomposes the probability distribution of the target token into the product of two conditional probabilities: (i) the probability of the frequency class, and (ii) the probability of the token within the target frequency class. Models learn more uniform probability distributions because each distribution is confined to a subset of the vocabulary. Significant performance gains on seven relevant metrics suggest that our approach improves not only the diversity but also the quality of generated texts.
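To make the factorization concrete, below is a minimal PyTorch sketch of a two-stage softmax in the spirit of F^2-Softmax. The FactorizedSoftmax module, the random token-to-class assignment (a stand-in for the frequency-balanced grouping MefMax would produce), and all names and shapes are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class FactorizedSoftmax(nn.Module):
    """Two-stage softmax: p(token) = p(class) * p(token | class).

    `token_to_class` maps each vocabulary id to a frequency class.
    In the paper this grouping comes from MefMax; here it is an
    arbitrary illustrative assignment.
    """

    def __init__(self, hidden_dim, token_to_class, num_classes):
        super().__init__()
        self.register_buffer("token_to_class", token_to_class)  # shape (V,)
        vocab_size = token_to_class.numel()
        self.class_head = nn.Linear(hidden_dim, num_classes)
        self.token_head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, hidden, target_tokens):
        # (i) log-probability of the target token's frequency class
        class_logp = F.log_softmax(self.class_head(hidden), dim=-1)
        target_class = self.token_to_class[target_tokens]          # (B,)
        class_term = class_logp.gather(-1, target_class.unsqueeze(-1)).squeeze(-1)

        # (ii) log-probability of the token *within* its class: mask out
        # tokens from other classes so the softmax is confined to a subset
        # of the vocabulary
        token_logits = self.token_head(hidden)                     # (B, V)
        same_class = self.token_to_class.unsqueeze(0) == target_class.unsqueeze(-1)
        token_logits = token_logits.masked_fill(~same_class, float("-inf"))
        token_logp = F.log_softmax(token_logits, dim=-1)
        token_term = token_logp.gather(-1, target_tokens.unsqueeze(-1)).squeeze(-1)

        # negative log-likelihood of the factorized probability
        return -(class_term + token_term).mean()

if __name__ == "__main__":
    torch.manual_seed(0)
    V, C, H, B = 100, 4, 16, 8
    # Random placeholder for MefMax's frequency-based grouping (hypothetical).
    token_to_class = torch.randint(0, C, (V,))
    model = FactorizedSoftmax(H, token_to_class, C)
    loss = model(torch.randn(B, H), torch.randint(0, V, (B,)))
    print(loss.item())

Because the second softmax is masked to the tokens of a single frequency class, the model fits a distribution over a much smaller, frequency-balanced subset of the vocabulary, which is the mechanism the abstract credits for the more uniform probability estimates.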