论文标题
通过图神经网络来解决可证明的艰苦代表选择
Tackling Provably Hard Representative Selection via Graph Neural Networks
论文作者
论文摘要
代表性选择(RS)是从代表数据集的数据集中找到一小部分示例的问题。在本文中,我们研究了归因图的RS,并专注于找到代表性节点,以优化对所选代表训练的模型的准确性。从理论上讲,我们通过证明其特定的,高度实用的变体(用于学习的RS)在任何合理因素内都难以近似,这意味着广泛使用的替代功能的最佳解决方案与模型的实际精度之间的最佳解决方案之间存在显着的潜在差距。然后,我们研究了(同质)图结构在数据点之间可用或可以构建的设置。我们表明,使用适当的建模方法,这种结构的存在可以将硬RS(用于学习)问题变成可以有效解决的问题。为此,我们开发了RS-GNN,这是一种基于图形神经网络的基于表示学习的RS模型。从经验上讲,我们通过证明RS-GNN在八个基准的套件上证明了RS-GNN特征特征相似性引起的图形特征相似性引起的图形特征相似性引起的图表的有效性以及从节点特征相似性引起的图形问题的有效性。
Representative Selection (RS) is the problem of finding a small subset of exemplars from a dataset that is representative of the dataset. In this paper, we study RS for attributed graphs, and focus on finding representative nodes that optimize the accuracy of a model trained on the selected representatives. Theoretically, we establish a new hardness result forRS (in the absence of a graph structure) by proving that a particular, highly practical variant of it (RS for Learning) is hard to approximate in polynomial time within any reasonable factor, which implies a significant potential gap between the optimum solution of widely-used surrogate functions and the actual accuracy of the model. We then study the setting where a (homophilous) graph structure is available, or can be constructed, between the data points.We show that with an appropriate modeling approach, the presence of such a structure can turn a hard RS (for learning) problem into one that can be effectively solved. To this end, we develop RS-GNN, a representation learning-based RS model based on Graph Neural Networks. Empirically, we demonstrate the effectiveness of RS-GNN on problems with predefined graph structures as well as problems with graphs induced from node feature similarities, by showing that RS-GNN achieves significant improvements over established baselines on a suite of eight benchmarks.