Paper Title
E2E Refined Dataset
Authors
Abstract
Although many researchers have used the well-known MR-to-text E2E dataset, its MR-text pairs contain many deletion, insertion, and substitution errors. Because such errors degrade the quality of MR-to-text systems, they should be fixed as far as possible. We therefore developed a refined dataset, together with Python programs that convert the original E2E dataset into the refined one.
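To make the error types concrete, the following is a minimal sketch of how deletion and insertion errors in an MR-text pair might be flagged. It is not the authors' conversion program. The function names (`parse_mr`, `missing_values`, `extra_values`), the sample pair, and the small `vocabulary` of slot values are all hypothetical, and the check relies on naive verbatim string matching, which would miss paraphrased values.

```python
import re


def parse_mr(mr: str) -> dict[str, str]:
    """Parse an E2E-style MR string such as 'name[The Eagle], food[French]'."""
    pairs = re.findall(r"([^,\[\]]+)\[([^\[\]]*)\]", mr)
    return {slot.strip(): value.strip() for slot, value in pairs}


def missing_values(mr: str, text: str) -> list[str]:
    """Slots whose value never appears verbatim in the text (possible deletions)."""
    lowered = text.lower()
    return [slot for slot, value in parse_mr(mr).items()
            if value.lower() not in lowered]


def extra_values(mr: str, text: str,
                 vocabulary: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Known slot values found in the text but not licensed by the MR
    (possible insertions). `vocabulary` is a hypothetical slot-to-values map."""
    slots = parse_mr(mr)
    lowered = text.lower()
    return [(slot, value)
            for slot, values in vocabulary.items()
            for value in values
            if value.lower() in lowered
            and slots.get(slot, "").lower() != value.lower()]


# Hypothetical MR-text pair: 'food' is dropped and a price is added.
mr = "name[The Eagle], eatType[coffee shop], food[French]"
text = "The Eagle is a cheap coffee shop."
vocabulary = {"priceRange": ["cheap", "moderate", "high"]}

deleted = missing_values(mr, text)
inserted = extra_values(mr, text, vocabulary)
```

Here `deleted` holds the slots the text fails to mention and `inserted` holds the values the text mentions without support from the MR. A pair in which the same slot shows up in both lists would be a candidate substitution error.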