Paper Title

Write It Like You See It: Detectable Differences in Clinical Notes By Race Lead To Differential Model Recommendations

Authors

Hammaad Adam, Ming Ying Yang, Kenrick Cato, Ioana Baldini, Charles Senteio, Leo Anthony Celi, Jiaming Zeng, Moninder Singh, Marzyeh Ghassemi

Abstract

Clinical notes are becoming an increasingly important data source for machine learning (ML) applications in healthcare. Prior research has shown that deploying ML models can perpetuate existing biases against racial minorities, as bias can be implicitly embedded in data. In this study, we investigate the level of implicit race information available to ML models and human experts and the implications of model-detectable differences in clinical notes. Our work makes three key contributions. First, we find that models can identify patient self-reported race from clinical notes even when the notes are stripped of explicit indicators of race. Second, we determine that human experts are not able to accurately predict patient race from the same redacted clinical notes. Finally, we demonstrate the potential harm of this implicit information in a simulation study, and show that models trained on these race-redacted clinical notes can still perpetuate existing biases in clinical treatment decisions.
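The first finding above — that a model can recover patient self-reported race from redacted note text — can be illustrated with a minimal text-classification sketch. This is not the authors' pipeline; it is a generic bag-of-words classifier on synthetic placeholder notes and labels, showing only the general setup of predicting a demographic label from free text.

```python
# Minimal sketch (assumed setup, not the paper's actual model):
# a TF-IDF + logistic regression classifier predicting a group label
# from free-text notes. Notes and labels below are synthetic placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

notes = [
    "pt presents with chest pain, denies tobacco use",
    "patient agitated, non-compliant with medication regimen",
    "pt reports shortness of breath on exertion, pleasant and cooperative",
    "patient declined pain medication, follow up in two weeks",
]
labels = [0, 1, 0, 1]  # placeholder group labels, not real demographics

# Fit the pipeline and predict on the same toy notes.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(notes, labels)
preds = clf.predict(notes)
```

In the study's actual experiments, the notes are first stripped of explicit race indicators; the point of this sketch is only that any text features correlated with the label, explicit or implicit, are available to such a model.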
