达格玛：通过M矩阵学习DAGS和对数确定的环形表征

论文标题

达格玛：通过M矩阵学习DAGS和对数确定的环形表征

DAGMA: Learning DAGs via M-matrices and a Log-Determinant Acyclicity Characterization

论文作者

Bello, Kevin, Aragam, Bryon, Ravikumar, Pradeep

论文摘要

从数据中学习定向的无环图（DAG）的组合问题最近被构成了纯连续优化问题，它通过基于矩阵指数函数的跟踪来利用DAG的可区分的无环表征。现有的无环特征基于以下想法：邻接矩阵的功率包含有关步行和周期的信息。在这项工作中，我们根据log-diperminant（log-det）函数提出了一种新的无环特性，该功能利用了DAG的nilpotency属性。为了处理DAG的固有不对称性，我们将日志数据表征的域与$ \ textit {M-matrices} $集合在一起，这是与正面确定矩阵锥体定义的经典日志函数的关键区别。与先前提出的无环函数相似，我们的表征也是精确且可区分的。但是，与现有特征相比，我们的对数数据函数：（1）更好地检测大周期；（2）行为更好的梯度；（3）它的运行时间在实践中的数量级更快。从优化方面，我们删除了典型的增强拉格朗日方案，并提出了dagma（$ \ textit {dags via d dags via in m-matrices for Acyclicity} $），这种方法类似于障碍方法的中心路径。 DAGMA的中心路径中的每个点都是通过我们的对数数据函数正常的无约束问题的解决方案，然后我们表明，在中心路径的极限上，确保解决方案是DAG。最后，我们为$ \ textit {Linear} $和$ \ textit {nonlinear} $ SEMS提供了广泛的实验，并表明我们的方法可以达到针对最先进方法的大加速和较小的结构性锤击距离。实施该方法的代码是开源的，并在https://github.com/kevinsbello/dagma上公开获得。

The combinatorial problem of learning directed acyclic graphs (DAGs) from data was recently framed as a purely continuous optimization problem by leveraging a differentiable acyclicity characterization of DAGs based on the trace of a matrix exponential function. Existing acyclicity characterizations are based on the idea that powers of an adjacency matrix contain information about walks and cycles. In this work, we propose a new acyclicity characterization based on the log-determinant (log-det) function, which leverages the nilpotency property of DAGs. To deal with the inherent asymmetries of a DAG, we relate the domain of our log-det characterization to the set of $\textit{M-matrices}$, which is a key difference to the classical log-det function defined over the cone of positive definite matrices. Similar to acyclicity functions previously proposed, our characterization is also exact and differentiable. However, when compared to existing characterizations, our log-det function: (1) Is better at detecting large cycles; (2) Has better-behaved gradients; and (3) Its runtime is in practice about an order of magnitude faster. From the optimization side, we drop the typically used augmented Lagrangian scheme and propose DAGMA ($\textit{DAGs via M-matrices for Acyclicity}$), a method that resembles the central path for barrier methods. Each point in the central path of DAGMA is a solution to an unconstrained problem regularized by our log-det function, then we show that at the limit of the central path the solution is guaranteed to be a DAG. Finally, we provide extensive experiments for $\textit{linear}$ and $\textit{nonlinear}$ SEMs and show that our approach can reach large speed-ups and smaller structural Hamming distances against state-of-the-art methods. Code implementing the proposed method is open-source and publicly available at https://github.com/kevinsbello/dagma.

下载PDF全文

下载文献需遵守相关版权规定

论文标题