Paper Title
SLAM-Inspired Simultaneous Contextualization and Interpreting for Incremental Conversation Sentences
Paper Authors
Paper Abstract
Distributed word representations have improved performance on many natural language tasks. In many methods, however, each word label is assigned only one meaning, and the context-dependent meanings of polysemous words are rarely handled. Although some research has dealt with polysemous words, these methods determine word meanings from a large batch of documents. Hence, there are two problems with applying them to sequential sentences, as in a conversation that contains ambiguous expressions. The first problem is that they cannot sequentially deal with the interdependence between context and word interpretation, in which the context is decided by the word interpretations and the word interpretations are decided by the context. Context estimation must thus be performed in parallel so that multiple interpretations can be pursued. The second problem is that previous methods learn new interpretations offline from large-scale sets of sentences, with the learning and inference steps clearly separated. Methods that rely on offline learning cannot obtain new interpretations during a conversation. Hence, to dynamically estimate the conversation context and the interpretations of polysemous words in sequential sentences, we propose Simultaneous Contextualization And INterpreting (SCAIN), a method based on the traditional Simultaneous Localization And Mapping (SLAM) algorithm. With the SCAIN algorithm, we can sequentially optimize the interdependence between context and word interpretation while obtaining new interpretations online. For experimental evaluation, we created two datasets: one from Wikipedia's disambiguation pages and the other from real conversations. For both datasets, the results confirmed that SCAIN could effectively achieve sequential optimization of the interdependence and acquisition of new interpretations.
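The interdependence described in the abstract can be illustrated with a toy sketch. It is not the SCAIN algorithm itself, whose details the abstract does not give. It swaps in a simple beam of parallel hypotheses, each pairing a context vector with the sense choices made so far, and it adds a new sense online when no stored sense fits the current context. Every name here (`step`, `blend`, `new_sense_below`), the 2-D vectors, and the scoring rule are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def blend(context, vec, rate=0.5):
    """Move the context vector part of the way toward vec."""
    return [(1 - rate) * c + rate * x for c, x in zip(context, vec)]


def step(hypotheses, sentence, vectors, senses, beam=3, new_sense_below=0.0):
    """Advance all hypotheses by one sentence (hypothetical toy update).

    hypotheses: list of (score, context, picks) tuples kept in parallel.
    vectors:    unambiguous word -> vector (toy values, assumed).
    senses:     ambiguous word -> list of sense vectors; may grow online.
    Sketch limitation: only the first ambiguous word per sentence is used.
    """
    candidates = []
    for score, context, picks in hypotheses:
        # Unambiguous words are evidence: reward hypotheses whose context
        # already agrees with them, then let them move the context.
        for word in sentence:
            if word in vectors:
                score += cosine(context, vectors[word])
                context = blend(context, vectors[word])
        ambiguous = [w for w in sentence if w in senses]
        if not ambiguous:
            candidates.append((score, context, picks))
            continue
        word = ambiguous[0]
        sims = [cosine(context, s) for s in senses[word]]
        # Online acquisition: if no stored sense fits, create one from context.
        if max(sims, default=-1.0) < new_sense_below:
            senses[word].append(list(context))
            sims.append(cosine(context, context))
        # Branch: the interpretation is scored by the context, and the
        # chosen interpretation in turn updates the context.
        for i, sim in enumerate(sims):
            candidates.append(
                (score + sim, blend(context, senses[word][i]), picks + [(word, i)])
            )
    candidates.sort(key=lambda h: h[0], reverse=True)
    return candidates[:beam]


if __name__ == "__main__":
    vectors = {"money": [0.0, 1.0]}
    senses = {"bank": [[1.0, 0.0], [0.0, 1.0]]}  # 0: riverside, 1: financial
    hyps = [(0.0, [0.0, 0.0], [])]
    for sentence in (["bank"], ["money"]):
        hyps = step(hyps, sentence, vectors, senses)
    print(hyps[0][2])  # best hypothesis after both sentences
```

In this toy run the word "bank" arrives before any context exists, so both interpretations are carried forward in parallel. The later word "money" then favors the hypothesis that chose the financial sense. A real system would need learned embeddings and a principled weighting scheme in place of these hand-set vectors.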