GFlowCausal: Generative Flow Networks for Causal Discovery

Paper Title

GFlowCausal: Generative Flow Networks for Causal Discovery

Authors

Li, Wenqian; Li, Yinchuan; Zhu, Shengyu; Shao, Yunfeng; Hao, Jianye; Pang, Yan

Abstract

Causal discovery aims to uncover the causal structure among a set of variables. Score-based approaches mainly search for the best Directed Acyclic Graph (DAG) under a predefined score function. However, most of them do not scale to large problems because of their limited search ability. Inspired by active learning in generative flow networks, we propose GFlowCausal, a novel approach to learning a DAG from observational data. It converts the graph search problem into a generation problem in which directed edges are added gradually. GFlowCausal aims to learn the best policy for generating high-reward DAGs through sequential actions, with probabilities proportional to predefined rewards. We propose a plug-and-play module based on transitive closure to ensure efficient sampling. Theoretical analysis shows that this module effectively guarantees acyclicity and the consistency between final states and fully-connected graphs. We conduct extensive experiments on both synthetic and real datasets, and the results show that the proposed approach is superior and also performs well in large-scale settings.
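The role of transitive closure in edge-by-edge DAG generation can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`allowed_edges`, `add_edge`, `sample_dag`) are hypothetical, and a uniform random choice stands in for the learned, reward-driven policy described in the abstract. The sketch only shows how tracking reachability lets each step rule out edges that would close a cycle, and how sampling until no edge remains yields a fully connected DAG.

```python
import random


def allowed_edges(reach, edges, d):
    """List edges (i, j) that can be added without creating a cycle.

    reach[i] is the set of nodes reachable from i (the transitive closure).
    Adding i -> j closes a cycle exactly when j already reaches i.
    """
    return [
        (i, j)
        for i in range(d)
        for j in range(d)
        if i != j and (i, j) not in edges and i not in reach[j]
    ]


def add_edge(reach, edges, i, j):
    """Add edge i -> j and update the transitive closure in place."""
    edges.add((i, j))
    # j and everything j reaches become reachable from i
    # and from every node that already reaches i.
    new_targets = reach[j] | {j}
    for k in range(len(reach)):
        if k == i or i in reach[k]:
            reach[k] |= new_targets


def sample_dag(d, rng):
    """Add edges one at a time until no acyclic edge remains.

    Assumption: a uniform random choice replaces the learned policy.
    """
    reach = [set() for _ in range(d)]
    edges = set()
    while True:
        candidates = allowed_edges(reach, edges, d)
        if not candidates:
            break
        i, j = rng.choice(candidates)
        add_edge(reach, edges, i, j)
    return edges, reach


edges, reach = sample_dag(5, random.Random(0))
```

In this sketch the mask is recomputed from the closure at every step, so no separate acyclicity check is ever run on the finished graph. When sampling stops, every pair of nodes is joined by exactly one edge, which gives d(d-1)/2 edges for d nodes.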

