Paper title
Modelling the semantics of text in complex document layouts using graph transformer networks
Authors
Abstract
Representing structured text from complex documents typically calls for different machine learning techniques, such as language models for paragraphs and convolutional neural networks (CNNs) for table extraction, which prohibits drawing links between text spans from different content types. In this article we propose a model that approximates the human reading pattern of a document and outputs a unique semantic representation for every text span irrespective of the content type they are found in. We base our architecture on a graph representation of the structured text, and we demonstrate that not only can we retrieve semantically similar information across documents but also that the embedding space we generate captures useful semantic information, similar to language models that work only on text sequences.