Paper title
Identifying Counterfactual Queries with the R Package cfid
Paper authors
Abstract
In the framework of structural causal models, counterfactual queries describe events that concern multiple alternative states of the system under study. Counterfactual queries often take the form of "what if" type questions such as "would an applicant have been hired if they had over 10 years of experience, when in reality they only had 5 years of experience?" Such questions and counterfactual inference in general are crucial, for example when addressing the problem of fairness in decision-making. Because counterfactual events contain contradictory states of the world, it is impossible to conduct a randomized experiment to address them without making several restrictive assumptions. However, it is sometimes possible to identify such queries from observational and experimental data by representing the system under study as a causal model, and the available data as symbolic probability distributions. Shpitser and Pearl (2007) constructed two algorithms, called ID* and IDC*, for identifying counterfactual queries and conditional counterfactual queries, respectively. These two algorithms are analogous to the ID and IDC algorithms by Shpitser and Pearl (2006) for identification of interventional distributions, which were implemented in R by Tikka and Karvanen (2017) in the causaleffect package. We present the R package cfid that implements the ID* and IDC* algorithms. Identification of counterfactual queries and the features of cfid are demonstrated via examples.
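To make the workflow described in the abstract concrete, the following is a minimal sketch of how a conditional counterfactual query might be posed with cfid. The graph, the variable names, and the value encoding (0 for a baseline value, 1 for an alternative value) are illustrative assumptions and are not taken from the abstract. The calls `dag()`, `cf()`, `conj()`, and `identifiable()` reflect the package interface as we understand it and should be checked against the installed version.

```r
# Sketch only: assumes the cfid package is installed from CRAN.
library(cfid)

# Hypothetical causal graph; "<->" marks a latent common cause of X and Y.
g <- dag("X -> W -> Y <- Z <- D X <-> Y")

# Counterfactual variables: cf(variable, value index, interventions).
v1 <- cf("Y", 0, c(X = 0))  # Y = y under the intervention do(X = x)
v2 <- cf("X", 1)            # X = x' observed in the actual world
v3 <- cf("Z", 0, c(D = 0))  # Z = z under the intervention do(D = d)
v4 <- cf("D", 0)            # D = d observed in the actual world

# Query P(gamma | delta): gamma is the event of interest, delta the evidence.
gamma <- conj(v1)
delta <- conj(v2, v3, v4)

# Run the conditional identification algorithm on the query.
out <- identifiable(g, gamma, delta)
print(out)
```

The conditioning event here mixes an observed value `x'` with a counterfactual intervention on the same variable, which is the kind of contradictory state of the world that the abstract refers to.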