Paper title
Autoregressive Generative Modeling with Noise Conditional Maximum Likelihood Estimation
Paper authors
Paper abstract
We introduce a simple modification to the standard maximum likelihood estimation (MLE) framework. Rather than maximizing a single unconditional likelihood of the data under the model, we maximize a family of \textit{noise conditional} likelihoods consisting of the data perturbed by a continuum of noise levels. We find that models trained this way are more robust to noise, obtain higher test likelihoods, and generate higher-quality images. They can also be sampled from via a novel score-based sampling scheme which combats the classical \textit{covariate shift} problem that occurs during sample generation in autoregressive models. Applying this augmentation to autoregressive image models, we obtain 3.32 bits per dimension on the ImageNet 64x64 dataset, and substantially improve the quality of generated samples in terms of the Fréchet Inception Distance (FID), from 37.50 to 12.09 on the CIFAR-10 dataset.
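The idea of maximizing a family of noise conditional likelihoods, rather than a single unconditional one, can be illustrated with a minimal sketch. This is a toy under stated assumptions, not the paper's implementation: the model is a one-dimensional Gaussian instead of an autoregressive image model, the perturbation is additive Gaussian noise, and the noise level is drawn uniformly from `[0, sigma_max]`. The function names, `sigma_max`, and `samples_per_point` are all hypothetical and do not come from the abstract.

```python
import math
import random


def gaussian_log_pdf(x, mean, var):
    """Log density of a univariate Gaussian N(mean, var) at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def model_log_likelihood(x_noisy, sigma, mu, log_var):
    """Toy noise-conditional model log p(x_noisy | sigma).

    Assumption: clean data follow N(mu, exp(log_var)), so data perturbed by
    Gaussian noise of standard deviation sigma follow
    N(mu, exp(log_var) + sigma^2).
    """
    return gaussian_log_pdf(x_noisy, mu, math.exp(log_var) + sigma ** 2)


def noise_conditional_objective(data, mu, log_var, sigma_max=1.0,
                                samples_per_point=8, seed=0):
    """Monte Carlo estimate of the average noise-conditional log-likelihood.

    For each data point, draw a noise level sigma (assumed uniform on
    [0, sigma_max]), perturb the point with Gaussian noise of that level,
    and score the perturbed point under the model conditioned on sigma.
    """
    rng = random.Random(seed)
    total = 0.0
    count = 0
    for x in data:
        for _ in range(samples_per_point):
            sigma = rng.uniform(0.0, sigma_max)
            x_noisy = x + sigma * rng.gauss(0.0, 1.0)
            total += model_log_likelihood(x_noisy, sigma, mu, log_var)
            count += 1
    return total / count


if __name__ == "__main__":
    rng = random.Random(1)
    data = [rng.gauss(2.0, 0.5) for _ in range(200)]
    # Parameters matching the data-generating distribution should score
    # higher than a mismatched setting.
    print(noise_conditional_objective(data, mu=2.0, log_var=math.log(0.25)))
    print(noise_conditional_objective(data, mu=0.0, log_var=0.0))
```

Training would maximize this quantity over the model parameters. Setting `sigma_max=0.0` recovers the ordinary average log-likelihood of the clean data, which makes the relationship to standard MLE explicit in this toy setting.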