Paper title
LayoutXLM vs. GNN: An Empirical Evaluation of Relation Extraction for Documents
Paper authors
Abstract
This paper investigates the Relation Extraction task in documents by benchmarking two different neural network models: a multi-modal language model (LayoutXLM) and a Graph Neural Network, the Edge Convolution Network (ECN). For this benchmark, we use the XFUND dataset, released along with LayoutXLM. While both models reach similar results, they exhibit very different characteristics. This raises the question of how to integrate various modalities in a neural network: by merging all modalities through additional pretraining (LayoutXLM), or in a cascaded way (ECN). We conclude by discussing some methodological issues that must be considered for new datasets and task definitions in the domain of Information Extraction with complex documents.