Paper title
Modal clustering of matrix-variate data
Authors
Abstract
The nonparametric formulation of density-based clustering, known as modal clustering, draws a correspondence between groups and the domains of attraction of the modes of the density function underlying the data. Its probabilistic foundation allows for a natural, yet nontrivial, generalization of the approach to the matrix-valued setting, which is increasingly common, for example, in longitudinal and multivariate spatio-temporal studies. In this work we introduce nonparametric estimators of matrix-variate distributions based on kernel methods and analyze their asymptotic properties. Additionally, we propose a generalization of the mean-shift procedure for identifying the modes of the estimated density. Given the intrinsic high dimensionality of matrix-variate data, we discuss some locally adaptive solutions to handle the problem. We test the procedure via extensive simulations, including comparisons with some competitors, and illustrate its performance through two high-dimensional real data applications.
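As a rough illustration of the idea described in the abstract, the following is a minimal sketch of mean-shift mode hunting on matrix-valued observations. It is not the authors' estimator. It assumes a matrix-normal (Gaussian) kernel with fixed, separable row and column bandwidth matrices `U` and `V`, and it omits the locally adaptive bandwidths mentioned above. The function names (`kernel_weights`, `mean_shift`, `modal_clustering`) and the parameter `merge_tol` are hypothetical, chosen only for this example.

```python
import numpy as np


def kernel_weights(x, data, U_inv, V_inv):
    """Unnormalized matrix-normal kernel weights of each observation at x."""
    diff = data - x  # shape (n, p, q)
    # quad[i] = tr(V^{-1} D_i^T U^{-1} D_i), with D_i = X_i - x
    quad = np.einsum("nij,ik,nkl,lj->n", diff, U_inv, diff, V_inv)
    # Subtracting the minimum rescales all weights by a common factor,
    # which cancels in the weighted mean and avoids underflow.
    return np.exp(-0.5 * (quad - quad.min()))


def mean_shift(x0, data, U_inv, V_inv, tol=1e-6, max_iter=500):
    """Iterate the Gaussian mean-shift fixed-point update starting from x0."""
    x = x0.copy()
    for _ in range(max_iter):
        w = kernel_weights(x, data, U_inv, V_inv)
        # Kernel-weighted average of the observations, shape (p, q)
        x_new = np.tensordot(w, data, axes=1) / w.sum()
        if np.linalg.norm(x_new - x) < tol:  # Frobenius norm
            return x_new
        x = x_new
    return x


def modal_clustering(data, U, V, merge_tol=1e-2):
    """Label each observation by the mode its mean-shift path reaches.

    Assumption of this sketch: two converged points closer than merge_tol
    (in Frobenius norm) are treated as the same mode.
    """
    U_inv, V_inv = np.linalg.inv(U), np.linalg.inv(V)
    modes = []
    labels = np.empty(len(data), dtype=int)
    for i, x0 in enumerate(data):
        m = mean_shift(x0, data, U_inv, V_inv)
        for k, mode in enumerate(modes):
            if np.linalg.norm(m - mode) < merge_tol:
                labels[i] = k
                break
        else:
            modes.append(m)
            labels[i] = len(modes) - 1
    return np.array(modes), labels
```

Here `data` is an array of shape `(n, p, q)` holding `n` observations, each a `p x q` matrix. When `U` and `V` are identity matrices, the quadratic form reduces to the squared Frobenius distance, so the sketch coincides with ordinary Gaussian mean-shift on the vectorized matrices. Non-identity `U` and `V` let the row and column directions be smoothed differently.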