Paper Title

Conversational Structure Aware and Context Sensitive Topic Model for Online Discussions

Authors

Sun, Yingcheng; Loparo, Kenneth; Kolacinski, Richard

Abstract

Millions of online discussions are generated every day on social media platforms. Topic modelling is an efficient way to understand large text datasets at scale. Conventional topic models have had limited success on online discussions. To overcome their limitations, we use the tree structure of discussion threads and propose a "popularity" metric, which quantifies the number of replies to a comment and extends the frequency of word occurrences, and a "transitivity" concept, which characterizes topic dependency among nodes in a nested discussion thread. We build a Conversational Structure Aware Topic Model (CSATM) based on popularity and transitivity to infer topics and their assignments to comments. Experiments on real forum datasets demonstrate improved topic extraction under six different coherence measures and impressive accuracy for topic assignments.
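
The popularity idea can be illustrated with a small sketch. This is not the CSATM formulation, which the abstract does not spell out. It assumes, purely for illustration, that a comment's popularity is one plus its number of direct replies and that each word in the comment is counted that many times. The names `Comment`, `popularity`, and `weighted_word_counts` are hypothetical, and transitivity is not modelled here.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Comment:
    """A node in a discussion thread tree (hypothetical structure)."""
    text: str
    replies: list = field(default_factory=list)


def popularity(comment):
    # Assumption: popularity = 1 + number of direct replies.
    # The paper's actual definition may differ.
    return 1 + len(comment.replies)


def weighted_word_counts(root):
    """Count words over the whole thread, weighting each comment's
    words by that comment's popularity."""
    counts = Counter()
    stack = [root]
    while stack:
        node = stack.pop()
        weight = popularity(node)
        for word in node.text.lower().split():
            counts[word] += weight
        stack.extend(node.replies)
    return counts


# A toy thread: one root comment, two replies, one nested reply.
thread = Comment("battery life is poor", [
    Comment("battery drains fast"),
    Comment("screen is great", [Comment("screen looks sharp")]),
])
counts = weighted_word_counts(thread)
```

In this toy thread, words from the root comment carry more weight than words from leaf comments because the root attracted more replies. Such weighted counts could stand in for raw term frequencies when fitting a conventional topic model.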
