Paper title
VCDM: Leveraging Variational Bi-encoding and Deep Contextualized Word Representations for Improved Definition Modeling
Paper authors
Paper abstract
In this paper, we tackle the task of definition modeling, where the goal is to learn to generate definitions of words and phrases. Existing approaches for this task are discriminative, combining distributional and lexical semantics in an implicit rather than direct way. To tackle this issue we propose a generative model for the task, introducing a continuous latent variable to explicitly model the underlying relationship between a phrase used within a context and its definition. We rely on variational inference for estimation and leverage contextualized word embeddings for improved performance. Our approach is evaluated on four existing challenging benchmarks with the addition of two new datasets, "Cambridge" and the first non-English corpus "Robert", which we release to complement our empirical study. Our Variational Contextual Definition Modeler (VCDM) achieves state-of-the-art performance in terms of automatic and human evaluation metrics, demonstrating the effectiveness of our approach.
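The abstract describes a generative model with a continuous latent variable, estimated by variational inference. As a rough illustration only, a generic conditional evidence lower bound of the kind such models typically optimize can be written as below. The symbols are assumptions introduced here and do not come from the abstract: $w$ is the phrase, $c$ its context, $d$ the definition, $z$ the latent variable, $q_\phi$ an approximate posterior, and $p_\theta$ the generative model. The paper's actual objective may differ in its conditioning and factorization.

```latex
% Generic conditional ELBO (illustrative; notation assumed, not taken from the paper)
\log p_\theta(d \mid w, c)
  \;\ge\;
  \mathbb{E}_{q_\phi(z \mid w, c, d)}\!\left[ \log p_\theta(d \mid z, w, c) \right]
  \;-\;
  \mathrm{KL}\!\left( q_\phi(z \mid w, c, d) \,\middle\|\, p_\theta(z \mid w, c) \right)
```

The first term rewards latent codes from which the definition can be reconstructed. The second keeps the posterior, which sees the definition, close to the prior, which sees only the phrase and its context, so that definitions can still be generated at test time when no reference definition is available.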