Paper Title
Stochastic Natural Language Generation Using Dependency Information
Paper Authors
Paper Abstract
This article presents a stochastic corpus-based model for generating natural language text. Our model first encodes dependency relations from the training data as a feature set, then concatenates these features to produce a new dependency tree for a given meaning representation, and finally generates a natural language utterance from the produced dependency tree. We test our model on nine domains covering tabular, dialogue-act, and RDF formats. Our model outperforms corpus-based state-of-the-art methods trained on tabular datasets, and achieves results comparable to neural network-based approaches trained on the dialogue-act, E2E, and WebNLG datasets on the BLEU and ERR evaluation metrics. We also report human evaluation results showing that our model produces high-quality utterances in terms of informativeness, naturalness, and overall quality.
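The three-stage pipeline described in the abstract (encode dependency relations as features, concatenate features into a dependency tree for a new meaning representation, linearize the tree into an utterance) can be sketched as follows. This is a minimal illustrative toy, not the paper's implementation: the function names, the slot-keyed feature store, and the flat edge representation of dependency trees are all assumptions made for the example.

```python
# Hypothetical sketch of the abstract's pipeline; all names and data
# structures are illustrative assumptions, not the paper's actual code.
from collections import defaultdict

def extract_features(training_pairs):
    """Stage 1: record, per slot, the dependency edges seen in training.

    training_pairs: list of (mr, deps), where mr is a list of slot names
    and deps is a list of (slot, word) dependency edges.
    """
    features = defaultdict(list)
    for mr, deps in training_pairs:
        for slot in mr:
            # keep only edges whose head realizes this slot
            features[slot].extend(d for d in deps if d[0] == slot)
    return features

def build_tree(mr, features):
    """Stage 2: concatenate stored edges to form a tree for the new MR."""
    tree = []
    for slot in mr:
        tree.extend(features.get(slot, []))
    return tree

def linearize(tree):
    """Stage 3: read the utterance off the tree, left to right."""
    return " ".join(word for _, word in tree)

# Toy usage on two single-slot training examples.
training = [(["name"], [("name", "Alimentum")]),
            (["area"], [("area", "riverside")])]
feats = extract_features(training)
tree = build_tree(["name", "area"], feats)
print(linearize(tree))  # → Alimentum riverside
```

A real system would of course rank candidate trees stochastically and handle richer dependency labels; the sketch only shows how slot-level features can be concatenated and linearized.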