Paper Title

Measuring similarity in co-occurrence data using ego-networks

Authors

Xiaomeng Wang, Yijun Ran, Tao Jia

Abstract

Co-occurrence associations are widely observed in empirical data. Mining the information in co-occurrence data is essential for advancing our understanding of systems such as social networks, ecosystems, and brain networks. Measuring the similarity of entities is one of the important tasks, and it is usually achieved with a network-based approach. Here we show that traditional methods based on the aggregated network can introduce unwanted indirect relationships. To cope with this issue, we propose a similarity measure based on the ego network of each entity, which effectively accounts for the change in an entity's centrality from one ego network to another. The proposed index is easy to calculate and has a clear physical meaning. Using two different data sets, we compare the new index with existing ones. We find that the new index outperforms traditional network-based similarity measures and can sometimes surpass the embedding method. Meanwhile, the similarity given by the new index is only weakly correlated with that given by other methods, so it provides a different dimension for quantifying similarity in co-occurrence data. Altogether, our work extends network-based similarity measures and can potentially be applied to several related tasks.
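The contrast between the aggregated network and an ego network can be sketched with a toy example. This is only an illustration under stated assumptions, not the index proposed in the paper, whose formula the abstract does not give. The record format (a list of sets of entity names), the function names `aggregated_network`, `ego_network`, and `neighbors`, and the reading of an ego network as "the co-occurrence network built only from records that contain the ego" are all hypothetical choices made for this sketch.

```python
from collections import defaultdict
from itertools import combinations


def aggregated_network(records):
    """Weighted co-occurrence graph over all records.

    Each edge (a, b) with a < b maps to the number of records
    in which a and b appear together.
    """
    weights = defaultdict(int)
    for record in records:
        for a, b in combinations(sorted(set(record)), 2):
            weights[(a, b)] += 1
    return dict(weights)


def ego_network(records, ego):
    """Assumed definition: the same graph, built only from records containing `ego`."""
    return aggregated_network([r for r in records if ego in r])


def neighbors(network, node):
    """Set of nodes that share an edge with `node`."""
    result = set()
    for a, b in network:
        if a == node:
            result.add(b)
        elif b == node:
            result.add(a)
    return result


# Toy data: A and C never appear in the same record.
records = [{"A", "B"}, {"B", "C"}, {"B", "C", "D"}]

agg = aggregated_network(records)
ego_a = ego_network(records, "A")

# In the aggregated network, A and C share the neighbor B, so a
# common-neighbor style measure would score them as somewhat similar.
common = neighbors(agg, "A") & neighbors(agg, "C")

# In A's ego network, C does not appear at all.
nodes_in_ego_a = {n for edge in ego_a for n in edge}
```

Here `common` is `{"B"}` while `nodes_in_ego_a` is `{"A", "B"}`: the aggregated view links A and C through B, whereas A's own ego network contains no trace of C. This is one plausible reading of the "unwanted indirect relationships" that the abstract attributes to aggregated-network methods.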
