Paper Title

Ensemble Clustering via Co-association Matrix Self-enhancement

Authors

Jia, Yuheng; Tao, Sirui; Wang, Ran; Wang, Yongheng

Abstract

Ensemble clustering integrates a set of base clustering results to generate a stronger one. Existing methods usually rely on a co-association (CA) matrix, which measures how many times two samples are grouped into the same cluster across the base clusterings. However, when the constructed CA matrix is of low quality, performance degrades. In this paper, we propose a simple yet effective CA matrix self-enhancement framework that improves the CA matrix to achieve better clustering performance. Specifically, we first extract the high-confidence (HC) information from the base clusterings to form a sparse HC matrix. By simultaneously propagating the highly reliable information of the HC matrix to the CA matrix and complementing the HC matrix according to the CA matrix, the proposed method generates an enhanced CA matrix for better clustering. Technically, the proposed model is formulated as a symmetric constrained convex optimization problem, which is efficiently solved by an alternating iterative algorithm with convergence and a global optimum theoretically guaranteed. Extensive experimental comparisons with twelve state-of-the-art methods on eight benchmark datasets substantiate the effectiveness, flexibility and efficiency of the proposed model in ensemble clustering. The code and datasets can be downloaded at https://github.com/Siritao/EC-CMS.
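
To make the two matrices in the abstract concrete, the following is a minimal sketch of how a CA matrix can be built from base clusterings and how a sparse HC matrix might be extracted from it. It is not the authors' implementation: the function names `co_association` and `high_confidence`, the normalization by the number of base clusterings, and the fixed threshold of 0.8 are all assumptions made for illustration, and the self-enhancement optimization itself is not shown.

```python
import numpy as np


def co_association(base_labels):
    """Build a normalized co-association (CA) matrix.

    base_labels: array of shape (m, n) holding m base clusterings of n samples.
    Entry (i, j) of the result is the fraction of base clusterings that
    place samples i and j in the same cluster.
    """
    base_labels = np.asarray(base_labels)
    m, n = base_labels.shape
    ca = np.zeros((n, n))
    for labels in base_labels:
        # Pairwise comparison: True where two samples share a cluster label.
        ca += labels[:, None] == labels[None, :]
    return ca / m


def high_confidence(ca, threshold=0.8):
    """Keep only CA entries at or above `threshold`; zero out the rest.

    The threshold value is a hypothetical choice for this sketch.
    """
    return np.where(ca >= threshold, ca, 0.0)


# Three toy base clusterings of four samples (made-up data).
base = np.array([
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
])

ca = co_association(base)
hc = high_confidence(ca, threshold=0.8)
print(ca)
print(hc)
```

In this toy example, samples 0 and 1 are grouped together by every base clustering, so their entry survives in the HC matrix, while samples 2 and 3 agree in only two of three clusterings and are dropped at the assumed threshold.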
