Paper title
Validating Causal Inference Methods
Paper authors
Paper abstract
The fundamental challenge of causal inference is that counterfactual outcomes are never fully observed for any unit. Furthermore, in observational studies, treatment assignment is likely to be confounded. Many statistical methods have emerged for causal inference under unconfoundedness given pre-treatment covariates, including propensity score-based methods, prognostic score-based methods, and doubly robust methods. Unfortunately for applied researchers, there is no 'one-size-fits-all' causal method that performs optimally in every setting. In practice, causal methods are primarily evaluated quantitatively on handcrafted simulated data. Such data-generating procedures can be of limited value because they are typically stylized models of reality: they are simplified for tractability and lack the complexities of real-world data. For applied researchers, it is critical to understand how well a method performs on the data at hand. Our work introduces a deep generative model-based framework, Credence, to validate causal inference methods. The framework's novelty stems from its ability to generate synthetic data anchored at the empirical distribution of the observed sample, and therefore virtually indistinguishable from it. The approach allows the user to specify ground truth for the form and magnitude of causal effects and confounding bias as functions of covariates. The simulated data sets are then used to evaluate the potential performance of various causal estimation methods when applied to data similar to the observed sample. We demonstrate Credence's ability to accurately assess the relative performance of causal estimation techniques in an extensive simulation study and in two real-world data applications from the Lalonde and Project STAR studies.
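The validation idea in the abstract (simulate data with a known ground-truth effect and known confounding, then score estimators against that truth) can be illustrated with a minimal sketch. This is not Credence's implementation: the abstract describes a deep generative model anchored at the observed sample, whereas the sketch below assumes a simple Gaussian covariate generator as a stand-in. All function names, the effect function `tau`, and the propensity model are hypothetical choices made for illustration only.

```python
import numpy as np


def simulate(n, rng):
    """Draw one synthetic data set with a user-specified effect and confounding.

    Assumption: Gaussian covariates stand in for samples from a generative
    model fitted to real data.
    """
    x = rng.normal(size=(n, 2))
    tau = 1.0 + 0.5 * x[:, 0]                           # specified effect tau(x)
    propensity = 1.0 / (1.0 + np.exp(-1.5 * x[:, 0]))   # assignment depends on x
    t = rng.binomial(1, propensity)
    y0 = 2.0 * x[:, 0] + x[:, 1] + rng.normal(scale=0.5, size=n)
    y = y0 + t * tau                                    # observed outcome
    return x, t, y, tau.mean()                          # last item: true sample ATE


def naive_ate(x, t, y):
    """Difference in means, ignoring confounding."""
    return y[t == 1].mean() - y[t == 0].mean()


def regression_ate(x, t, y):
    """OLS with treatment-covariate interactions.

    Covariates are centred so the coefficient on t estimates the ATE.
    """
    xc = x - x.mean(axis=0)
    design = np.column_stack([np.ones(len(y)), t, xc, t[:, None] * xc])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


def evaluate(estimators, n=2000, reps=50, seed=0):
    """Return the RMSE of each estimator against the known ground truth."""
    rng = np.random.default_rng(seed)
    errors = {name: [] for name in estimators}
    for _ in range(reps):
        x, t, y, true_ate = simulate(n, rng)
        for name, estimator in estimators.items():
            errors[name].append(estimator(x, t, y) - true_ate)
    return {name: float(np.sqrt(np.mean(np.square(e))))
            for name, e in errors.items()}


rmse = evaluate({"naive": naive_ate, "regression": regression_ate})
print(rmse)
```

Because the true effect is known by construction, the estimators can be ranked by error. In this toy setup the regression model matches the outcome model used by the simulator, so it should beat the naive comparison, which absorbs the confounding through the first covariate.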