Paper title
A kernel Stein test of goodness of fit for sequential models
Paper authors
Paper abstract
We propose a goodness-of-fit measure for probability densities modeling observations with varying dimensionality, such as text documents of differing lengths or variable-length sequences. The proposed measure is an instance of the kernel Stein discrepancy (KSD), which has been used to construct goodness-of-fit tests for unnormalized densities. The KSD is defined by its Stein operator: current operators used in testing apply to fixed-dimensional spaces. As our main contribution, we extend the KSD to the variable-dimension setting by identifying appropriate Stein operators, and propose a novel KSD goodness-of-fit test. As with the previous variants, the proposed KSD does not require the density to be normalized, allowing the evaluation of a large class of models. Our test is shown to perform well in practice on discrete sequential data benchmarks.