Modelling the semantics of text in complex document layouts using graph transformer networks

Paper title

Modelling the semantics of text in complex document layouts using graph transformer networks

Authors

Barillot, Thomas Roland; Saks, Jacob; Lilyanova, Polena; Torgas, Edward; Hu, Yachen; Liu, Yuanqing; Balupuri, Varun; Gaskell, Paul

Abstract

Representing structured text from complex documents typically calls for different machine learning techniques, such as language models for paragraphs and convolutional neural networks (CNNs) for table extraction, which prohibits drawing links between text spans from different content types. In this article we propose a model that approximates the human reading pattern of a document and outputs a unique semantic representation for every text span irrespective of the content type they are found in. We base our architecture on a graph representation of the structured text, and we demonstrate that not only can we retrieve semantically similar information across documents but also that the embedding space we generate captures useful semantic information, similar to language models that work only on text sequences.
