Paper Title
Population-Based Hierarchical Non-negative Matrix Factorization for Survey Data
Paper Authors
Paper Abstract
Motivated by the problem of identifying potential hierarchical population structure on modern survey data containing a wide range of complex data types, we introduce population-based hierarchical non-negative matrix factorization (PHNMF). PHNMF is a variant of hierarchical non-negative matrix factorization based on feature similarity. As such, it enables an automatic and interpretable approach for identifying and understanding hierarchical structure in a data matrix constructed from a wide range of data types. Our numerical experiments on synthetic and real survey data demonstrate that PHNMF can recover latent hierarchical population structure in complex data with high accuracy. Moreover, the recovered subpopulation structure is meaningful and can be useful for improving downstream inference.