Paper Title

Unsupervised Online Feature Selection for Cost-Sensitive Medical Diagnosis

Authors

Arun Verma, Manjesh K. Hanawal, Nandyala Hemachandra

Abstract

In medical diagnosis, physicians predict the state of a patient by checking measurements (features) obtained from a sequence of tests, e.g., blood test, urine test, followed by invasive tests. As tests are often costly, one would like to obtain only those features (tests) that can establish the presence or absence of the state conclusively. Another aspect of medical diagnosis is that we often face unsupervised prediction tasks, as the true state of the patients may not be known. Motivated by such medical diagnosis problems, we consider a Cost-Sensitive Medical Diagnosis (CSMD) problem, where the true state of patients is unknown. We formulate the CSMD problem as a feature selection problem where each test gives a feature that can be used in a prediction model. Our objective is to learn strategies for selecting the features that give the best trade-off between accuracy and cost. We exploit the 'Weak Dominance' property of the problem to develop online algorithms that identify a set of features providing an 'optimal' trade-off between cost and prediction accuracy without requiring knowledge of the true state of the medical condition. Our empirical results validate the performance of our algorithms on problem instances generated from real-world datasets.
