Paper title
Improving the Factual Correctness of Radiology Report Generation with Semantic Rewards
Paper authors
Paper abstract
Neural image-to-text radiology report generation systems could improve radiology reporting by reducing the repetitive work of drafting reports and by identifying possible medical errors. These systems have achieved promising performance as measured by widely used NLG metrics such as BLEU and CIDEr. However, current systems face important limitations. First, their architectures have grown more complex while offering only marginal improvements on NLG metrics. Second, systems that score highly on these metrics are not always factually complete or consistent, owing to inadequate training and evaluation. Recent studies have shown that these systems can be substantially improved by methods that encourage 1) generating domain entities consistent with the reference and 2) describing these entities in inferentially consistent ways. So far, these methods rely on weakly supervised (rule-based) approaches and on named entity recognition systems that are not specific to the chest X-ray domain. To overcome this limitation, we propose a new method, the RadGraph reward, to further improve the factual completeness and correctness of generated radiology reports. More precisely, we leverage the RadGraph dataset, which contains chest X-ray reports annotated with entities and the relations between them. On two open radiology report datasets, our system improves scores by up to 14.2% and 25.3% on metrics evaluating the factual correctness and completeness of reports.
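The abstract does not spell out how the reward is computed, so the following is only a minimal sketch of one way an entity-overlap reward could look, assuming the entities of the generated and reference reports have already been extracted into sets. The function name `overlap_f1`, the F1 formulation, and the `(token, label)` pairs with their label strings are all illustrative assumptions, not the paper's definition or the RadGraph label schema.

```python
from typing import Hashable, Set


def overlap_f1(hypothesis: Set[Hashable], reference: Set[Hashable]) -> float:
    """Harmonic mean of precision and recall between two sets of items."""
    if not hypothesis and not reference:
        return 1.0  # assumption: two empty sets count as perfect agreement
    matched = len(hypothesis & reference)
    if matched == 0:
        return 0.0
    precision = matched / len(hypothesis)
    recall = matched / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hypothetical (token, label) entities, as if extracted from two reports.
reference_entities = {
    ("pleural", "anatomy"),
    ("effusion", "observation-present"),
    ("pneumothorax", "observation-absent"),
}
generated_entities = {
    ("pleural", "anatomy"),
    ("effusion", "observation-present"),
    ("pneumothorax", "observation-present"),  # wrong polarity, so no match
}

reward = overlap_f1(generated_entities, reference_entities)
print(round(reward, 3))  # 0.667
```

Matching on the full `(token, label)` pair means a report that mentions the right finding with the wrong polarity gets no credit for it, which is the kind of factual error that n-gram metrics such as BLEU can miss. The same function could be applied to sets of relation triples if relations are also to be scored.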