Paper Title
Probabilistic Model of Narratives Over Topical Trends in Social Media: A Discrete Time Model
Authors
Abstract
Online social media platforms are becoming a primary source of news and narratives about worldwide events. However, a systematic, summarization-based approach to narrative extraction that helps communicate the main underlying events is lacking. To address this issue, we propose a novel event-based narrative summary extraction framework. Our proposed framework is designed as a probabilistic topic model with a categorical time distribution, followed by extractive text summarization. Our topic model identifies the recurrence of topics over time at varying time resolutions. The framework not only captures the topic distributions from the data, but also approximates fluctuations in user activity over time. Furthermore, we define the significance-dispersity trade-off (SDT) as a comparison measure to identify the topic with the highest lifetime attractiveness in a timestamped corpus. We evaluate our model on a large corpus of Twitter data, including more than one million tweets in the domain of the disinformation campaigns conducted against the White Helmets of Syria. Our results indicate that the proposed framework is effective in identifying topical trends, as well as in extracting narrative summaries from timestamped text corpora.