Paper Title

Tag and Correct: Question aware Open Information Extraction with Two-stage Decoding

Authors

Kuo, Martin; Liang, Yaobo; Ji, Lei; Duan, Nan; Shou, Linjun; Gong, Ming; Chen, Peng

Abstract

Question Aware Open Information Extraction (Question aware Open IE) takes a question and a passage as inputs and outputs an answer tuple that contains a subject, a predicate, and one or more arguments. Each field of the answer is a natural language word sequence extracted from the passage. Compared to a span answer, the semi-structured answer has two advantages: it is more readable and more falsifiable. There are two approaches to this problem. One is the extractive method, which extracts candidate answers from the passage with an Open IE model and ranks them by matching them against the question. It makes full use of the passage information at the extraction step, but the extraction is independent of the question. The other is the generative method, which uses a sequence-to-sequence model to generate the answer directly. It takes the question and the passage as input together, but it generates the answer from scratch and so does not exploit the fact that most answer words come from the passage. To guide generation with the passage, we present a two-stage decoding model that contains a tagging decoder and a correction decoder. In the first stage, the tagging decoder tags keywords in the passage. In the second stage, the correction decoder generates the answer based on the tagged keywords. Although it has two stages, our model can be trained end-to-end. Compared to previous generative models, we generate better answers by generating from coarse to fine. We evaluate our model on WebAssertions (Yan et al., 2018), a Question aware Open IE dataset. Our model achieves a BLEU score of 59.32, which is better than previous generative methods.
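
The tag-then-correct data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: in the paper both stages are learned decoders trained end-to-end, whereas here the names `AnswerTuple`, `two_stage_decode`, `toy_tagger`, and `toy_corrector` are hypothetical, and the two toy functions are hand-written stand-ins that only show what each stage consumes and produces.

```python
from dataclasses import dataclass
from typing import Callable, List

Tokens = List[str]


@dataclass
class AnswerTuple:
    """Semi-structured answer: a subject, a predicate, and one or more arguments."""
    subject: str
    predicate: str
    arguments: List[str]


# Hypothetical interfaces (assumptions, not taken from the paper).
# Tagger: (question, passage) -> one 0/1 tag per passage token.
Tagger = Callable[[Tokens, Tokens], List[int]]
# Corrector: (question, passage, tagged keywords) -> final answer tuple.
Corrector = Callable[[Tokens, Tokens, Tokens], AnswerTuple]


def two_stage_decode(question: Tokens, passage: Tokens,
                     tagger: Tagger, corrector: Corrector) -> AnswerTuple:
    """Run stage 1 (tagging), collect the tagged keywords, then run stage 2 (correction)."""
    tags = tagger(question, passage)
    if len(tags) != len(passage):
        raise ValueError("tagger must return one tag per passage token")
    # The coarse draft: passage tokens that stage 1 marked as keywords.
    keywords = [tok for tok, tag in zip(passage, tags) if tag == 1]
    # Stage 2 turns the coarse draft into the final structured answer.
    return corrector(question, passage, keywords)


def toy_tagger(question: Tokens, passage: Tokens) -> List[int]:
    """Stand-in for the tagging decoder: keep every token outside a tiny stop list."""
    stop_words = {"was", "in", "the", "a"}  # illustrative only
    return [0 if tok.lower() in stop_words else 1 for tok in passage]


def toy_corrector(question: Tokens, passage: Tokens, keywords: Tokens) -> AnswerTuple:
    """Stand-in for the correction decoder.

    Assumes the keyword draft has the shape [subject..., predicate, argument].
    """
    return AnswerTuple(
        subject=" ".join(keywords[:-2]),
        predicate=keywords[-2],
        arguments=[keywords[-1]],
    )


question = "where was Marie Curie born".split()
passage = "Marie Curie was born in Warsaw".split()
answer = two_stage_decode(question, passage, toy_tagger, toy_corrector)
print(answer)
```

Passing the two stages in as callables keeps the sketch neutral about how each stage is built; the toy stand-ins could be swapped for trained models that follow the same assumed interfaces.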
