Paper Title

Simple and Scalable Algorithms for Cluster-Aware Precision Medicine

Authors

Buch, Amanda M., Liston, Conor, Grosenick, Logan

Abstract

AI-enabled precision medicine promises a transformational improvement in healthcare outcomes by enabling data-driven personalized diagnosis, prognosis, and treatment. However, the well-known "curse of dimensionality" and the clustered structure of biomedical data together interact to present a joint challenge in the high dimensional, limited observation precision medicine regime. To overcome both issues simultaneously we propose a simple and scalable approach to joint clustering and embedding that combines standard embedding methods with a convex clustering penalty in a modular way. This novel, cluster-aware embedding approach overcomes the complexity and limitations of current joint embedding and clustering methods, which we show with straightforward implementations of hierarchically clustered principal component analysis (PCA), locally linear embedding (LLE), and canonical correlation analysis (CCA). Through both numerical experiments and real-world examples, we demonstrate that our approach outperforms traditional and contemporary clustering methods on highly underdetermined problems (e.g., with just tens of observations) as well as on large sample datasets. Importantly, our approach does not require the user to choose the desired number of clusters, but instead yields interpretable dendrograms of hierarchically clustered embeddings. Thus our approach improves significantly on existing methods for identifying patient subgroups in multiomics and neuroimaging data, enabling scalable and interpretable biomarkers for precision medicine.
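To make the idea of a convex clustering penalty concrete, the sketch below evaluates the generic convex clustering objective, 0.5 * ||Z - U||_F^2 + gamma * sum_{i<j} w_ij * ||u_i - u_j||_2, on a PCA embedding of toy data. This is an illustration under stated assumptions, not the authors' algorithm: the abstract does not give their formulation, and the code embeds first and scores the penalty afterwards rather than solving the two jointly. The function names `pca_embed` and `convex_clustering_objective`, the uniform weights, and the toy data are all hypothetical.

```python
import numpy as np


def pca_embed(X, k):
    """Project the centered rows of X onto the top-k principal directions."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T


def convex_clustering_objective(Z, U, gamma, weights=None):
    """Generic convex clustering objective (assumed form, not from the paper).

    Z: (n, k) embedded observations.
    U: (n, k) one centroid per observation.
    gamma: strength of the fusion penalty that pulls centroids together.
    weights: optional (n, n) pairwise weights; uniform if omitted.
    """
    n = Z.shape[0]
    if weights is None:
        weights = np.ones((n, n))
    fit = 0.5 * np.sum((Z - U) ** 2)
    i, j = np.triu_indices(n, k=1)  # each unordered pair once
    fusion = np.sum(weights[i, j] * np.linalg.norm(U[i] - U[j], axis=1))
    return fit + gamma * fusion


# Toy data: two well-separated groups of five observations in 20 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.3, (5, 20)), rng.normal(2.0, 0.3, (5, 20))])
Z = pca_embed(X, k=2)

# Two extreme centroid choices: every point keeps its own centroid,
# or all centroids are fused into the overall mean.
own_centroids = convex_clustering_objective(Z, Z, gamma=1.0)
fused_centroids = convex_clustering_objective(
    Z, np.tile(Z.mean(axis=0), (len(Z), 1)), gamma=1.0
)
```

In convex clustering generally, minimizing this objective over U for increasing values of gamma fuses centroids progressively, and the resulting sequence of merges is what can be read as a dendrogram without fixing the number of clusters in advance. A practical solver (for example, one based on ADMM) is beyond the scope of this sketch.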
