Paper Title
Nominal Metaphor Generation with Multitask Learning
Paper Authors
Paper Abstract
Metaphor generation is a challenging task that can benefit many downstream applications, such as dialogue systems and story generation, by improving user satisfaction. This paper tackles the problem of Chinese nominal metaphor generation by introducing a multitask metaphor generation framework with self-training and metaphor identification mechanisms. Self-training addresses the data scarcity issue of metaphor datasets: instead of relying solely on labelled metaphor datasets, which are usually small, self-training helps identify potential metaphors in a large-scale unlabelled corpus for metaphor generation. The metaphor weighting mechanism enables our model to focus on the metaphor-related parts of the input (e.g., the comparison between the metaphor and the comparator) during learning, and thus improves the metaphoricity of the generated metaphors. Our model is trained on an annotated corpus of 6.3k sentences containing diverse metaphorical expressions. Experimental results show that our model generates metaphors with better readability and creativity than the baseline models, even when training data is insufficient.
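The abstract does not spell out the self-training procedure; the following minimal sketch shows one common way such a loop can be organized, assuming a hypothetical metaphor identifier (a binary classifier exposing fit/predict_proba) and a seq2seq generator. All names, the confidence threshold, and the number of rounds are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical self-training loop for mining metaphors from unlabelled text.
# identifier and generator are placeholder objects, not a specific library API.

def self_train(identifier, generator, labelled, unlabelled,
               threshold=0.9, rounds=3):
    """Grow a small labelled metaphor corpus by repeatedly mining
    confident metaphor candidates from a large unlabelled pool."""
    pool = list(unlabelled)       # raw sentences without labels
    train_set = list(labelled)    # (sentence, label) pairs, label 1 = metaphor
    for _ in range(rounds):
        identifier.fit(train_set)             # retrain on the current corpus
        confident, rest = [], []
        for sentence in pool:
            # keep only sentences the identifier is confident are metaphorical
            if identifier.predict_proba(sentence) >= threshold:
                confident.append(sentence)
            else:
                rest.append(sentence)
        train_set += [(s, 1) for s in confident]  # add pseudo-labelled metaphors
        pool = rest                               # shrink the unlabelled pool
    generator.fit(train_set)  # train the metaphor generator on the enlarged set
    return generator
```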
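Likewise, the exact form of the metaphor weighting mechanism is not given here. A per-token weighted cross-entropy is one plausible realization, sketched below in PyTorch; the mask construction, the weight alpha, and the function name weighted_nll are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def weighted_nll(logits, targets, metaphor_mask, alpha=2.0):
    """Cross-entropy over generated tokens that up-weights positions
    belonging to the metaphorical comparison.

    logits:        (batch, seq_len, vocab) decoder outputs
    targets:       (batch, seq_len) gold token ids
    metaphor_mask: (batch, seq_len) 1.0 where the token is metaphor-related
    alpha:         extra weight on metaphor-related tokens (assumed value)
    """
    # per-token loss: F.cross_entropy expects (batch, vocab, seq_len)
    per_token = F.cross_entropy(
        logits.transpose(1, 2), targets, reduction="none")  # (batch, seq_len)
    # ordinary tokens keep weight 1.0; metaphor-related tokens get alpha
    weights = 1.0 + (alpha - 1.0) * metaphor_mask.float()
    return (weights * per_token).sum() / weights.sum()
```

Up-weighting rather than masking keeps the fluency signal from ordinary context tokens while pushing the model to attend to the metaphorical span, which matches the stated goal of improving metaphoricity without sacrificing readability.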