论文标题

二进制矩阵分解通过列的生成

Binary Matrix Factorisation via Column Generation

论文作者

Kovacs, Reka A., Gunluk, Oktay, Hauser, Raphael A.

论文摘要

识别二进制数据中的离散模式是机器学习和数据挖掘中的重要维度降低工具。在本文中,我们考虑了布尔算术下的低级二进制基质分解(BMF)的问题。由于这个问题的硬度,大多数以前的尝试都取决于启发式技术。我们将问题提出为混合整数线性程序,并使用列生成的大规模优化技术来解决它,而无需启发式模式挖掘。我们的方法着重于准确性和提供最佳保证。现实世界数据集的实验结果表明,我们提出的方法有效地产生了高度准确的分解,并改善了24个问题实例中15个以前可用的最著名结果。

Identifying discrete patterns in binary data is an important dimensionality reduction tool in machine learning and data mining. In this paper, we consider the problem of low-rank binary matrix factorisation (BMF) under Boolean arithmetic. Due to the hardness of this problem, most previous attempts rely on heuristic techniques. We formulate the problem as a mixed integer linear program and use a large scale optimisation technique of column generation to solve it without the need of heuristic pattern mining. Our approach focuses on accuracy and on the provision of optimality guarantees. Experimental results on real world datasets demonstrate that our proposed method is effective at producing highly accurate factorisations and improves on the previously available best known results for 15 out of 24 problem instances.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源