Paper title
Addressing selection bias and measurement error in COVID-19 case count data using auxiliary information
Paper authors
Paper abstract
Coronavirus case-count data have influenced government policies and drive most epidemiological forecasts. Limited testing is often cited as the main reason so little is known about the COVID-19 pandemic. While expanded testing is laudable, measurement error and selection bias are the two greatest problems limiting our understanding of the pandemic, and neither can be fully addressed by increased testing capacity. In this paper, we demonstrate their impact on estimation of point prevalence and the effective reproduction number. We show that estimates based on the millions of molecular tests in the US have the same mean squared error as a small simple random sample. To address this, we present a procedure that combines case-count data and random samples over time to estimate selection propensities based on key covariate information. We then combine these selection propensities with epidemiological forecast models to construct a \emph{doubly robust} estimation method that accounts for both measurement error and selection bias. This method is then applied to estimate Indiana's active infection prevalence using case-count, hospitalization, and death data with demographic information, a statewide random molecular sample collected from April 25--29, and Delphi's COVID-19 Trends and Impact Survey. We end with a series of recommendations based on the proposed methodology.
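The claim that millions of tests can be worth no more than a small simple random sample can be illustrated with a back-of-the-envelope calculation. The sketch below is not the paper's derivation; it assumes a single hypothetical parameter, `relative_testing_rate` (how much more likely an infected person is to be tested than an uninfected one), plus optional test sensitivity and specificity, and treats test results as binomial draws. All function names and numeric inputs are illustrative assumptions.

```python
def expected_positive_rate(prevalence, relative_testing_rate,
                           sensitivity=1.0, specificity=1.0):
    """Expected fraction of positive results among people who get tested.

    Assumption: infected people are `relative_testing_rate` times as likely
    to be tested as uninfected people (selection bias), and the test has the
    given sensitivity and specificity (measurement error).
    """
    r = relative_testing_rate
    # Share of truly infected people among those tested.
    infected_share = prevalence * r / (prevalence * r + (1.0 - prevalence))
    # True positives plus false positives.
    return (sensitivity * infected_share
            + (1.0 - specificity) * (1.0 - infected_share))


def naive_mse(prevalence, n_tests, relative_testing_rate,
              sensitivity=1.0, specificity=1.0):
    """MSE of the raw positive rate as an estimate of true prevalence."""
    q = expected_positive_rate(prevalence, relative_testing_rate,
                               sensitivity, specificity)
    bias = q - prevalence              # does not shrink as n_tests grows
    variance = q * (1.0 - q) / n_tests  # binomial sampling variance
    return bias ** 2 + variance


def equivalent_srs_size(prevalence, mse):
    """Size of a simple random sample (perfect test) with the same MSE."""
    return prevalence * (1.0 - prevalence) / mse


if __name__ == "__main__":
    # Hypothetical inputs: 2% prevalence, one million tests,
    # infected people five times as likely to be tested.
    p = 0.02
    mse = naive_mse(p, n_tests=1_000_000, relative_testing_rate=5.0)
    print(f"equivalent simple random sample size: {equivalent_srs_size(p, mse):.1f}")
```

Under these hypothetical inputs the squared bias dominates the sampling variance, so the million biased tests carry about as much information as a random sample of fewer than ten people. Adding more tests shrinks only the variance term, which is why increased capacity alone does not resolve the problem.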