Paper Title
Improving Radiology Report Generation Systems by Removing Hallucinated References to Non-existent Priors
Authors
Abstract
Current deep learning models trained to generate radiology reports from chest radiographs are capable of producing clinically accurate, clear, and actionable text that can advance patient care. However, such systems all succumb to the same problem: hallucinating references to non-existent prior reports. These hallucinations occur because the models are trained on datasets of real-world patient reports that inherently refer to priors. To address this, we propose two methods to remove references to priors in radiology reports: (1) a GPT-3-based few-shot approach that rewrites medical reports without references to priors; and (2) a BioBERT-based token classification approach that directly removes words referring to priors. We use these approaches to modify MIMIC-CXR, a publicly available dataset of chest X-rays and their associated free-text radiology reports; we then retrain CXR-RePaiR, a radiology report generation system, on the adapted MIMIC-CXR dataset. We find that our re-trained model, which we call CXR-ReDonE, outperforms previous report generation methods on clinical metrics, achieving an average BERTScore of 0.2351 (2.57% absolute improvement). We expect our approach to be broadly valuable in enabling current radiology report generation systems to be more directly integrated into clinical pipelines.
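To make the second method in the abstract more concrete, the following is a minimal sketch of the token-removal step only, not the authors' implementation. It assumes a classifier that assigns each token a label of 1 (refers to a prior) or 0 (keep). The BioBERT classifier is replaced here by a hypothetical keyword lookup (`PRIOR_CUES`, `toy_labeler`); those names, the tokenizer, and the example sentence are illustrative assumptions.

```python
import re
from typing import Callable, List

# Hypothetical keyword list standing in for a trained token classifier.
# The paper uses a BioBERT-based model; this lookup is only an illustration.
PRIOR_CUES = {"again", "prior", "previous", "previously", "compared", "interval"}


def tokenize(report: str) -> List[str]:
    """Split a report into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", report)


def toy_labeler(tokens: List[str]) -> List[int]:
    """Return 1 for tokens flagged as referring to a prior study, else 0."""
    return [1 if t.lower() in PRIOR_CUES else 0 for t in tokens]


def remove_prior_tokens(
    report: str,
    labeler: Callable[[List[str]], List[int]] = toy_labeler,
) -> str:
    """Drop every token the labeler flags and rebuild the sentence."""
    tokens = tokenize(report)
    labels = labeler(tokens)
    kept = [tok for tok, label in zip(tokens, labels) if label == 0]
    text = " ".join(kept)
    # Reattach punctuation that the join separated from the preceding word.
    return re.sub(r"\s+([.,;:!?])", r"\1", text)


print(remove_prior_tokens("Heart size is again normal."))
# prints: Heart size is normal.
```

Any function that maps a token list to 0/1 labels can be passed as `labeler`, so a real token classification model could be swapped in without changing the removal logic. Deleting tokens alone can leave ungrammatical fragments in some sentences, which is one reason a rewriting approach such as the few-shot method may be preferable in those cases.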