Paper title
Sample Complexity Result for Multi-category Classifiers of Bounded Variation
Authors
Abstract
We control the probability of a uniform deviation between the empirical and generalization performances of multi-category classifiers by means of an empirical L1-norm covering number, when these performances are defined on the basis of the truncated hinge loss function. The only assumption made on the functions implemented by the classifiers is that they are of bounded variation (BV). For such classifiers, we derive a sample size estimate sufficient for the two performances to be close with high probability. In particular, we are interested in the dependency of this estimate on the number C of classes. To this end, we first upper bound the fat-shattering dimension (the scale-sensitive version of the VC dimension) of sets of BV functions defined on R^d, obtaining a bound of O(1/epsilon^d) as the scale epsilon goes to zero. Second, we provide a sharper decomposition result for the fat-shattering dimension in terms of C, which for sets of BV functions improves the dependency from O(C^(d/2+1)) to O(C ln^2(C)). This improvement then propagates to the sample complexity estimate.