Paper Title
Simplex Clustering via sBeta with Applications to Online Adjustment of Black-Box Predictions
Paper Authors
Paper Abstract
We explore clustering the softmax predictions of deep neural networks and introduce a novel probabilistic clustering method, referred to as k-sBetas. In the general context of clustering discrete distributions, existing methods have focused on exploring distortion measures tailored to simplex data, such as the KL divergence, as alternatives to the standard Euclidean distance. We provide a general maximum a posteriori (MAP) perspective of clustering distributions, emphasizing that the statistical models underlying the existing distortion-based methods may not be descriptive enough. Instead, we optimize a mixed-variable objective measuring, within each cluster, the conformity of the data to the introduced sBeta density function, whose parameters are constrained and estimated jointly with the binary assignment variables. Our versatile formulation approximates various parametric densities for modeling simplex data and enables control of the cluster-balance bias. This yields highly competitive performance for the unsupervised adjustment of black-box model predictions in various scenarios. Our code, comparisons with existing simplex-clustering approaches, and our introduced softmax-prediction benchmarks are publicly available at: https://github.com/fchiaroni/Clustering_Softmax_Predictions.
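The abstract describes alternating between binary cluster assignments and per-cluster density-parameter estimation on simplex data. The following is a minimal illustrative sketch of that general scheme, not the paper's actual k-sBetas method: it uses standard per-dimension Beta densities fit by the method of moments in place of the paper's constrained sBeta density, omits the cluster-balance term, and all function names are hypothetical.

```python
import numpy as np
from math import lgamma

def fit_beta_moments(x, eps=1e-6):
    """Method-of-moments fit of Beta(a, b) to values in (0, 1)."""
    m, v = x.mean(), x.var() + eps
    common = m * (1.0 - m) / v - 1.0
    return max(m * common, eps), max((1.0 - m) * common, eps)

def beta_loglik(x, a, b, eps=1e-12):
    """Element-wise Beta log-density of x under Beta(a, b)."""
    x = np.clip(x, eps, 1.0 - eps)
    return ((a - 1.0) * np.log(x) + (b - 1.0) * np.log(1.0 - x)
            + lgamma(a + b) - lgamma(a) - lgamma(b))

def cluster_simplex(P, K, n_iters=20):
    """Hard-assignment clustering of n points on the (d-1)-simplex.

    Alternates: (1) fit one Beta per cluster and per dimension to the
    currently assigned points; (2) reassign each point to the cluster
    maximizing its summed per-dimension Beta log-likelihood.
    """
    n, d = P.shape
    # Simple deterministic init: dominant softmax coordinate.
    labels = P.argmax(axis=1) % K
    for _ in range(n_iters):
        params = np.ones((K, d, 2))  # defaults to uniform Beta(1, 1)
        for k in range(K):
            pts = P[labels == k]
            if len(pts) < 2:
                continue
            for j in range(d):
                params[k, j] = fit_beta_moments(pts[:, j])
        ll = np.zeros((n, K))
        for k in range(K):
            for j in range(d):
                ll[:, k] += beta_loglik(P[:, j], *params[k, j])
        labels = ll.argmax(axis=1)
    return labels
```

On two well-separated groups of softmax-like 2D points, this sketch recovers the groups; the actual k-sBetas formulation additionally constrains the density parameters and controls cluster balance, as stated in the abstract.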