论文标题

用于挖掘数据的整数线性编程框架

An Integer Linear Programming Framework for Mining Constraints from Data

论文作者

Meng, Tao, Chang, Kai-Wei

论文摘要

结构化的输出预测问题(例如,顺序标记,分层多类分类)通常涉及在输出标签空间上的约束。这些约束与学习的模型相互作用,以过滤不可行的解决方案并促进构建负责任的系统。但是,尽管约束很有用,但它们通常是基于手工制作的规则。这就提出了一个问题 - \ emph {我们可以根据学习算法从数据中挖掘约束和规则吗?} 在本文中,我们提出了一个从数据中挖掘约束的一般框架。特别是,我们将结构化输出预测的推断视为整数线性编程(ILP)问题。然后,鉴于目标函数的系数和相应的解决方案,我们通过估计可行集合的外部和内部多面体来挖掘潜在的约束。我们验证了各种合成和现实世界应用中提出的约束挖掘算法,并证明所提出的方法成功地识别了可行的设置。 特别是,我们表明我们的方法可以学会解决9x9 sudoku难题,而从示例中却不提供基本规则,从而最小化了树的问题。我们的算法还可以与神经网络模型集成,以学习多标签分类任务的层次标签结构。此外,我们提供了关于多面体的紧密性和采矿约束的可靠性的理论分析。

Structured output prediction problems (e.g., sequential tagging, hierarchical multi-class classification) often involve constraints over the output label space. These constraints interact with the learned models to filter infeasible solutions and facilitate in building an accountable system. However, although constraints are useful, they are often based on hand-crafted rules. This raises a question -- \emph{can we mine constraints and rules from data based on a learning algorithm?} In this paper, we present a general framework for mining constraints from data. In particular, we consider the inference in structured output prediction as an integer linear programming (ILP) problem. Then, given the coefficients of the objective function and the corresponding solution, we mine the underlying constraints by estimating the outer and inner polytopes of the feasible set. We verify the proposed constraint mining algorithm in various synthetic and real-world applications and demonstrate that the proposed approach successfully identifies the feasible set at scale. In particular, we show that our approach can learn to solve 9x9 Sudoku puzzles and minimal spanning tree problems from examples without providing the underlying rules. Our algorithm can also integrate with a neural network model to learn the hierarchical label structure of a multi-label classification task. Besides, we provide a theoretical analysis about the tightness of the polytopes and the reliability of the mined constraints.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源