Paper title
NewsTweet: A Dataset of Social Media Embedding in Online Journalism
Paper authors
Abstract
The inclusion of social media posts (tweets, in particular) in digital news stories, both as commentary and increasingly as news sources, has become commonplace in recent years. To study this phenomenon in sufficient depth, robust large-scale data collection from both news publishers and social media platforms is necessary. This work describes the construction of such a data pipeline. In the data collected from Google News, 13% of all stories were found to include embedded tweets, with sports and entertainment news containing the largest volumes of them. Public figures and celebrities were found to dominate these stories; however, relatively unknown users were also found to achieve newsworthiness. The collected data set, NewsTweet, and the associated acquisition pipeline stand to engender a wave of new inquiries into social content embedding from multiple research communities.