Paper title
Detecting Rumours with Latency Guarantees using Massive Streaming Data
Paper authors
Paper abstract
Today's social networks continuously generate massive streams of data, which provide a valuable starting point for the detection of rumours as soon as they start to propagate. However, rumour detection faces tight latency bounds, which cannot be met by contemporary algorithms, given the sheer volume of high-velocity streaming data emitted by social networks. Hence, in this paper, we argue for best-effort rumour detection that detects most rumours quickly rather than all rumours with a high delay. To this end, we combine techniques for efficient, graph-based matching of rumour patterns with effective load shedding that discards some of the input data while minimising the loss in accuracy. Experiments with large-scale real-world datasets illustrate the robustness of our approach in terms of runtime performance and detection accuracy under diverse streaming conditions.