Paper title
Distributionally Robust Causal Inference with Observational Data
Paper authors
Paper abstract
We consider the estimation of average treatment effects in observational studies and propose a new framework for causal inference that is robust to unobserved confounders. Our approach is based on distributionally robust optimization and proceeds in two steps. We first specify the maximal degree to which the distribution of unobserved potential outcomes may deviate from that of observed outcomes. We then derive sharp bounds on the average treatment effects under this assumption. Our framework encompasses the popular marginal sensitivity model as a special case, and we demonstrate how the proposed methodology can address a primary challenge of the marginal sensitivity model, namely that it produces uninformative results when unobserved confounders substantially affect treatment and outcome. Specifically, we develop an alternative sensitivity model, called the distributional sensitivity model, under the assumption that heterogeneity of treatment effect due to unobserved variables is relatively small. Unlike the marginal sensitivity model, the distributional sensitivity model allows for potential lack of overlap and often produces informative bounds even when unobserved variables substantially affect both treatment and outcome. Finally, we show how to extend the distributional sensitivity model to difference-in-differences designs and settings with instrumental variables. Through simulation and empirical studies, we demonstrate the applicability of the proposed methodology.
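The two-step idea in the abstract (first bound how far the unobserved outcome distribution may deviate from the observed one, then optimize over that set) can be illustrated with a small sketch. This is not the paper's estimator and not the distributional sensitivity model. It assumes, in the spirit of the marginal sensitivity model, that the density ratio between the unobserved and observed outcome distributions lies in [1/lam, lam], and it computes the resulting range of the self-normalized weighted mean on a sample. The function name `worst_case_mean_bounds` and the parameter `lam` are hypothetical.

```python
import numpy as np


def worst_case_mean_bounds(y, lam):
    """Range of sum(w * y) / sum(w) over weights w_i in [1/lam, lam].

    Assumption (illustrative only): the unobserved outcome distribution is a
    reweighting of the observed sample with density ratio in [1/lam, lam].
    """
    y = np.sort(np.asarray(y, dtype=float))
    n = y.size
    if n == 0:
        raise ValueError("y must be non-empty")
    if lam < 1:
        raise ValueError("lam must be at least 1")

    low_w, high_w = 1.0 / lam, float(lam)
    idx = np.arange(n)
    lower, upper = np.inf, -np.inf

    # A linear-fractional objective over a box is optimized at a vertex with
    # a threshold structure, so scanning all n + 1 split points is enough.
    for k in range(n + 1):
        # Upper bound: small weight on the k smallest outcomes.
        w_up = np.where(idx < k, low_w, high_w)
        upper = max(upper, float(w_up @ y / w_up.sum()))
        # Lower bound: large weight on the k smallest outcomes.
        w_lo = np.where(idx < k, high_w, low_w)
        lower = min(lower, float(w_lo @ y / w_lo.sum()))

    return lower, upper


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.normal(size=200)  # synthetic data, for illustration only
    for lam in (1.0, 1.5, 3.0):
        print(lam, worst_case_mean_bounds(sample, lam))
```

With `lam = 1` the interval collapses to the sample mean, and it widens as `lam` grows, which matches the abstract's remark that bounds of this type can become uninformative when unobserved confounding is allowed to be strong.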