论文标题
截短的矩阵功率迭代用于可区分的DAG学习
Truncated Matrix Power Iteration for Differentiable DAG Learning
论文作者
论文摘要
从观察数据中恢复潜在的定向无环图(DAG)结构,由于DAG限制的优化问题的组合性质,因此极具挑战性。最近,通过将DAG约束将DAG限制定义为平滑的相等性,通常基于邻接矩阵上的多项式,将DAG学习作为连续优化问题。现有方法将非常小的系数放在高阶多项式术语上以进行稳定,因为它们认为由于数字爆炸而导致高阶项上的大系数有害。相反,我们发现,当邻接矩阵的光谱射线量很小时,高阶术语上的大系数对DAG学习有益,并且高阶项的较大系数可以比小型零件要好得多。基于此,我们提出了一种具有有效截短的矩阵功率迭代的新型DAG学习方法,以基于几何序列的DAG约束。从经验上讲,我们的DAG学习方法在各种环境中的表现都优于先前的最新方法,在结构锤距离上通常以$ 3 $或更多的倍数。
Recovering underlying Directed Acyclic Graph (DAG) structures from observational data is highly challenging due to the combinatorial nature of the DAG-constrained optimization problem. Recently, DAG learning has been cast as a continuous optimization problem by characterizing the DAG constraint as a smooth equality one, generally based on polynomials over adjacency matrices. Existing methods place very small coefficients on high-order polynomial terms for stabilization, since they argue that large coefficients on the higher-order terms are harmful due to numeric exploding. On the contrary, we discover that large coefficients on higher-order terms are beneficial for DAG learning, when the spectral radiuses of the adjacency matrices are small, and that larger coefficients for higher-order terms can approximate the DAG constraints much better than the small counterparts. Based on this, we propose a novel DAG learning method with efficient truncated matrix power iteration to approximate geometric series based DAG constraints. Empirically, our DAG learning method outperforms the previous state-of-the-arts in various settings, often by a factor of $3$ or more in terms of structural Hamming distance.