Approximate Network Motif Mining Via Graph Learning

Paper Title

Approximate Network Motif Mining Via Graph Learning

Authors

Carlos Oliver, Dexiong Chen, Vincent Mallet, Pericles Philippopoulos, Karsten Borgwardt

Abstract

Frequent and structurally related subgraphs, also known as network motifs, are valuable features of many graph datasets. However, the high computational complexity of identifying motif sets in arbitrary datasets (motif mining) has limited their use in many real-world datasets. By automatically leveraging statistical properties of datasets, machine learning approaches have shown promise in several tasks with combinatorial complexity and are therefore a promising candidate for network motif mining. In this work we seek to facilitate the development of machine learning approaches aimed at motif mining. We propose a formulation of the motif mining problem as a node labelling task. In addition, we build benchmark datasets and evaluation metrics which test the ability of models to capture different aspects of motif discovery such as motif number, size, topology, and scarcity. Next, we propose MotiFiesta, a first attempt at solving this problem in a fully differentiable manner with promising results on challenging baselines. Finally, we demonstrate through MotiFiesta that this learning setting can be applied simultaneously to general-purpose data mining and interpretable feature extraction for graph classification tasks.
