Paper Title

Weakly-supervised Multi-output Regression via Correlated Gaussian Processes

Authors

Seokhyun Chung, Raed Al Kontar, Zhenke Wu

Abstract

Multi-output regression seeks to borrow strength and leverage commonalities across different but related outputs in order to enhance learning and prediction accuracy. A fundamental assumption is that the output/group membership labels for all observations are known. This assumption is often violated in real applications. For instance, in healthcare datasets, sensitive attributes such as ethnicity are often missing or unreported. To this end, we introduce a weakly-supervised multi-output model based on dependent Gaussian processes. Our approach is able to leverage data without complete group labels or possibly only prior belief on group memberships to enhance accuracy across all outputs. Through intensive simulations and case studies on an Insulin, Testosterone and Bodyfat dataset, we show that our model excels in multi-output settings with missing labels, while being competitive in traditional fully labeled settings. We end by highlighting the possible use of our approach in fair inference and sequential decision-making.
