Paper Title

E2E Refined Dataset

Authors

Keisuke Toyama, Katsuhito Sudoh, Satoshi Nakamura

Abstract

Although the well-known E2E dataset for meaning representation (MR)-to-text generation has been used by many researchers, its MR-text pairs include many deletion, insertion, and substitution errors. Since such errors affect the quality of MR-to-text systems, they must be fixed as much as possible. Therefore, we developed a refined dataset and some Python programs that convert the original E2E dataset into the refined dataset.
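To make the kind of error concrete, the following is a minimal sketch of how a deletion error might be flagged in a single MR-text pair. It is not the authors' conversion program. The helper names `parse_mr` and `find_deletions` are hypothetical, the example pair is made up for illustration, and the check assumes that "deletion" means an MR value that the text never mentions and that values appear verbatim in the text, which real E2E references often violate.

```python
import re


def parse_mr(mr: str) -> dict[str, str]:
    """Parse an E2E-style MR string such as 'name[The Eagle], food[French]'."""
    pairs = re.findall(r"([^,\[\]]+)\[([^\]]*)\]", mr)
    return {slot.strip(): value.strip() for slot, value in pairs}


def find_deletions(mr: str, text: str) -> list[str]:
    """Return slots whose value is not found verbatim in the text.

    Naive assumption: a value counts as realized only if it occurs as a
    case-insensitive substring, so paraphrases are reported as missing.
    """
    lowered = text.lower()
    return [
        slot
        for slot, value in parse_mr(mr).items()
        if value.lower() not in lowered
    ]


# Illustrative pair (not taken from the dataset): the text omits the food slot.
mr = "name[The Eagle], eatType[coffee shop], food[French]"
text = "The Eagle is a coffee shop."
missing = find_deletions(mr, text)
print(missing)
```

A substring test like this only catches the simplest cases. Insertion and substitution errors require knowing what the text says beyond the MR, so they would need slot-specific patterns or a lexicon rather than this check.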
