Paper Title

A Simple and Strong Baseline for End-to-End Neural RST-style Discourse Parsing

Authors

Kobayashi, Naoki; Hirao, Tsutomu; Kamigaito, Hidetaka; Okumura, Manabu; Nagata, Masaaki

Abstract

To promote and further develop RST-style discourse parsing models, we need a strong baseline that can be regarded as a reference for reporting reliable experimental results. This paper explores a strong baseline by integrating existing simple parsing strategies, top-down and bottom-up, with various transformer-based pre-trained language models. The experimental results obtained from two benchmark datasets demonstrate that parsing performance relies more strongly on the pre-trained language model than on the parsing strategy. In particular, the bottom-up parser achieves large performance gains over the current best parser when employing DeBERTa. Through our analysis of intra- and multi-sentential parsing and of nuclearity prediction, we further reveal that language models trained with a span-masking scheme especially boost parsing performance.
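To make the abstract's setup concrete, below is a minimal, illustrative sketch of pairing a pre-trained encoder with a bottom-up parsing loop over elementary discourse units (EDUs). Everything here is an assumption for illustration, not the authors' implementation: the checkpoint name `microsoft/deberta-base`, mean-pooled EDU embeddings, the greedy adjacent-merge rule, and the untrained dot-product scorer are all stand-ins for the paper's learned components.

```python
# Illustrative sketch only (not the paper's code): encode EDUs with a
# pre-trained DeBERTa encoder, then build a binary tree bottom-up by
# greedily merging the highest-scoring adjacent pair of spans.
import torch
from transformers import AutoTokenizer, AutoModel

# Assumed checkpoint; the paper compares several pre-trained language models.
tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-base")
encoder = AutoModel.from_pretrained("microsoft/deberta-base")

def encode_edus(edus):
    """Return one vector per EDU (mean-pooled token embeddings)."""
    batch = tokenizer(edus, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state  # (n_edus, seq_len, dim)
    mask = batch["attention_mask"].unsqueeze(-1)     # ignore padding tokens
    return (hidden * mask).sum(1) / mask.sum(1)

def bottom_up_parse(edus, score_fn):
    """Greedy bottom-up parsing: repeatedly merge the adjacent span pair
    with the highest score until a single tree covers all EDUs."""
    spans = [(i, i, vec) for i, vec in enumerate(encode_edus(edus))]
    tree = list(range(len(edus)))  # leaf node ids
    while len(spans) > 1:
        scores = [score_fn(spans[i][2], spans[i + 1][2])
                  for i in range(len(spans) - 1)]
        k = max(range(len(scores)), key=scores.__getitem__)
        left, right = spans[k], spans[k + 1]
        merged = (left[0], right[1], (left[2] + right[2]) / 2)
        spans[k:k + 2] = [merged]                 # replace pair with parent
        tree[k:k + 2] = [(tree[k], tree[k + 1])]  # record binary subtree
    return tree[0]

# Usage: an untrained dot-product scorer stands in for a learned one.
edus = ["Although it rained,", "the match went ahead,", "and the crowd stayed."]
print(bottom_up_parse(edus, lambda a, b: torch.dot(a, b).item()))
```

A top-down strategy would invert the loop, recursively choosing a split point within a span instead of merging adjacent spans; under the paper's finding, either loop matters less than the encoder plugged into `encode_edus`.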
