Paper title
Regularized K-means through hard-thresholding
Paper authors
Paper abstract
We study a framework of regularized $K$-means methods based on direct penalization of the size of the cluster centers. Different penalization strategies are considered and compared through simulation and theoretical analysis. Based on the results, we propose HT $K$-means, which uses an $\ell_0$ penalty to induce sparsity in the variables. Different techniques for selecting the tuning parameter are discussed and compared. The proposed method compares favorably with the most popular regularized $K$-means methods in an extensive simulation study. Finally, HT $K$-means is applied to several real data examples. Graphical displays are presented and used in these examples to gain more insight into the datasets.
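To make the idea of an $\ell_0$ penalty on the cluster centers concrete, the following is a minimal sketch, not the authors' implementation. It assumes a particular objective that the abstract does not spell out: the within-cluster sum of squares on centered data plus a cost `lam` for every variable whose centers are not all zero. Under that assumption, for fixed cluster assignments a variable is kept only if its between-cluster sum of squares exceeds `lam`, which gives a hard-thresholding step. The function names, the farthest-point initialization, and the scaling of `lam` are all hypothetical choices made for illustration.

```python
import numpy as np


def farthest_point_init(X, k):
    """Deterministic initialization (an illustrative choice, not from the paper)."""
    idx = [int(np.argmax((X ** 2).sum(axis=1)))]
    d2 = ((X - X[idx[0]]) ** 2).sum(axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(d2))
        idx.append(nxt)
        d2 = np.minimum(d2, ((X - X[nxt]) ** 2).sum(axis=1))
    return X[idx].copy()


def ht_kmeans_sketch(X, k, lam, n_iter=50):
    """Alternate nearest-center assignment with a hard-thresholded center update.

    Assumed objective (hypothetical scaling):
        sum_i ||x_i - mu_{c(i)}||^2 + lam * #{variables with a nonzero center}
    """
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)  # center so that a zero center means "no signal"
    p = X.shape[1]
    centers = farthest_point_init(X, k)
    for _ in range(n_iter):
        # Assignment step: nearest center in squared Euclidean distance.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Unpenalized update: plain cluster means (empty clusters stay at zero).
        counts = np.bincount(labels, minlength=k)
        means = np.zeros((k, p))
        for c in range(k):
            if counts[c] > 0:
                means[c] = X[labels == c].mean(axis=0)
        # Hard-thresholding step: zeroing variable j raises the fit term by its
        # between-cluster sum of squares, so keep j only if that exceeds lam.
        between = (counts[:, None] * means ** 2).sum(axis=0)
        active = between > lam
        new_centers = means * active
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers, active
```

On data where only a few variables separate the clusters, the returned `active` mask indicates which variables survive the threshold; larger values of `lam` remove more variables. How `lam` should be chosen is the tuning-parameter question the abstract refers to, and this sketch does not address it.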