Paper Title
Neural Topic Modeling by Incorporating Document Relationship Graph
Authors
Abstract
Graph Neural Networks (GNNs), which capture the relationships between graph nodes via message passing, have been an active research direction in the natural language processing community. In this paper, we propose the Graph Topic Model (GTM), a GNN-based neural topic model that represents a corpus as a document relationship graph. Documents and words in the corpus become nodes in the graph and are connected based on document-word co-occurrences. Introducing the graph structure establishes relationships between documents through their shared words, so the topical representation of a document is enriched by aggregating information from its neighboring nodes using graph convolution. Extensive experiments on three datasets demonstrate the effectiveness of the proposed approach.
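
The graph construction and neighbor aggregation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The toy corpus, the unweighted edges, the one-hot node features, and the GCN-style symmetric normalization with self-loops are all assumptions made for illustration, and the helper names (`build_graph`, `normalize`, `propagate`) are hypothetical. The abstract does not specify edge weights, learned parameters, or the topic-model decoder, so none of those appear here.

```python
import math

# Toy corpus (hypothetical data, for illustration only).
docs = [
    ["graph", "neural", "network"],
    ["topic", "model", "graph"],
    ["stock", "market"],
]

def build_graph(docs):
    """Build a document-word graph: document nodes first, then word nodes."""
    vocab = sorted({w for d in docs for w in d})
    n_docs = len(docs)
    index = {w: n_docs + i for i, w in enumerate(vocab)}
    n = n_docs + len(vocab)
    adj = [[0.0] * n for _ in range(n)]
    for d, words in enumerate(docs):
        for w in words:
            j = index[w]
            adj[d][j] = adj[j][d] = 1.0  # edge: document d contains word w
    return vocab, adj

def normalize(adj):
    """Assumed GCN-style normalization: D^{-1/2} (A + I) D^{-1/2}."""
    n = len(adj)
    a = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def propagate(a_hat, feats):
    """One message-passing step: every node aggregates its neighbors' features."""
    n, dim = len(a_hat), len(feats[0])
    return [[sum(a_hat[i][k] * feats[k][c] for k in range(n))
             for c in range(dim)] for i in range(n)]

vocab, adj = build_graph(docs)
a_hat = normalize(adj)
n = len(adj)

# One-hot features, so column j of the result tracks influence from node j.
feats = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
h1 = propagate(a_hat, feats)  # documents receive information from their words
h2 = propagate(a_hat, h1)     # documents receive information from other documents

print("doc0 <- doc1 after two steps:", h2[0][1])
print("doc0 <- doc2 after two steps:", h2[0][2])
```

In this sketch, documents 0 and 1 share the word "graph", so after two propagation steps document 0's representation carries a nonzero contribution from document 1. Document 2 shares no words with document 0, so its contribution stays at zero. This is the sense in which document relationships are established through shared words.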