Paper title

Holdout sets for safe predictive model updating

Paper authors

Sami Haidar-Wehbe, Samuel R. Emerson, Louis J. M. Aslett, James Liley

Paper abstract

Predictive risk scores for adverse outcomes are increasingly crucial in guiding health interventions. Such scores may need to be updated periodically because the distributions they model change over time. However, directly updating a risk score that is used to guide intervention can lead to biased risk estimates. To address this, we propose updating using a 'holdout set': a subset of the population that does not receive interventions guided by the risk score. Balancing the holdout set size is essential to ensure good performance of the updated risk score whilst minimising the number of held-out samples. We prove that this approach reduces adverse outcome frequency to an asymptotically optimal level and argue that often there is no competitive alternative. We describe conditions under which an optimal holdout size (OHS) can be readily identified, and introduce parametric and semi-parametric algorithms for OHS estimation. We apply our methods to the ASPRE risk score for pre-eclampsia to recommend a plan for updating it in the presence of change in the underlying data distribution. We show that the number of pre-eclampsia cases over time is minimised by using a holdout set of around 10,000 individuals.
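
The trade-off behind the holdout set size can be illustrated with a toy calculation. The sketch below is not the authors' parametric or semi-parametric estimator; it only assumes, for illustration, that each held-out individual incurs a fixed expected cost `k1`, and that each remaining individual incurs a cost that falls with the holdout size `n` along a power-law learning curve `a * n**(-b) + c`. The function names, the cost model, and all parameter values are hypothetical and do not come from the abstract.

```python
def total_cost(n, N, k1, a, b, c):
    """Expected total cost when n of N individuals are held out.

    Assumed (hypothetical) cost model:
      - each held-out individual costs k1 (no risk-score-guided intervention);
      - each remaining individual costs a * n**(-b) + c, which decreases as
        the updated score is trained on a larger holdout set.
    """
    cost_per_guided = a * n ** (-b) + c
    return k1 * n + cost_per_guided * (N - n)


def optimal_holdout_size(N, k1, a, b, c):
    """Brute-force search for the holdout size in 1..N-1 with minimal total cost."""
    return min(range(1, N), key=lambda n: total_cost(n, N, k1, a, b, c))


# Illustrative values only: they are not estimates for the ASPRE score.
N, k1, a, b, c = 20000, 0.4, 10.0, 1.5, 0.1
n_star = optimal_holdout_size(N, k1, a, b, c)
print("holdout size minimising the assumed cost:", n_star)
```

Under this assumed model, a holdout set that is too small leaves the updated score poorly trained, while one that is too large withholds guided interventions from more people than necessary; the minimum of `total_cost` sits between the two extremes.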
