Paper Title
PLOG: Table-to-Logic Pretraining for Logical Table-to-Text Generation
Paper Authors
Paper Abstract
Logical table-to-text generation is the task of generating logically faithful sentences from tables, which requires models to derive logical-level facts from table records via logical inference. It poses a new challenge for the logical-level content planning of table-to-text models. However, directly learning logical inference knowledge from table-text pairs is very difficult for neural models because of the ambiguity of natural language and the scarcity of parallel data. Hence, even large-scale pretrained language models exhibit low logical fidelity on logical table-to-text generation. In this work, we propose PLOG (Pretrained Logical Form Generator), a framework to improve generation fidelity. Specifically, PLOG is first pretrained on a table-to-logic-form generation (table-to-logic) task and then finetuned on downstream table-to-text tasks. The formal definition of logical forms enables us to collect large amounts of accurate logical forms from tables without human annotation. Moreover, PLOG can learn logical inference from table-logic pairs far more reliably than from table-text pairs. To evaluate our model, we further collect a controlled logical table-to-text dataset, CONTLOG, based on an existing dataset. On two benchmarks, LOGICNLG and CONTLOG, PLOG outperforms strong baselines by a large margin in logical fidelity, demonstrating the effectiveness of table-to-logic pretraining.
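To make the two-stage recipe concrete, below is a minimal sketch of the pretrain-then-finetune pipeline described in the abstract. It assumes a BART-style seq2seq model, a simple flat table linearization, and a made-up logical-form syntax; the model choice, data format, and hyperparameters are illustrative assumptions, not the paper's exact configuration.

```python
# A sketch of the PLOG training recipe: (1) pretrain a seq2seq model on
# table-to-logic-form generation, then (2) finetune the same weights on
# table-to-text. The table linearization and logical-form string below
# are hypothetical examples, not the paper's actual data format.
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

def train_step(source: str, target: str) -> float:
    """One gradient step on a (linearized table, target sequence) pair."""
    batch = tokenizer(source, return_tensors="pt", truncation=True)
    labels = tokenizer(target, return_tensors="pt", truncation=True).input_ids
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

table = ("title: gp3 series | "
         "row 1: chip=g3-500, freq=500 mhz | "
         "row 2: chip=g3-550, freq=550 mhz")

# Stage 1 (table-to-logic pretraining): because logical forms have a formal
# definition, targets like this can be enumerated from the table and verified
# automatically, so no human annotation is needed.
train_step(table, "eq { max { all_rows ; freq } ; 550 mhz }")

# Stage 2 (table-to-text finetuning): same weights, now trained on
# natural-language targets from a dataset such as LOGICNLG or CONTLOG.
train_step(table, "the g3-550 has the highest frequency in the gp3 series .")
```

In practice each stage would loop over a full dataset in batches; the single-pair steps here are only meant to show that both stages share one model and differ only in the target sequence, which is what lets the logical inference learned in stage 1 transfer to stage 2.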