Paper Title
Labelling imaging datasets on the basis of neuroradiology reports: a validation study
Paper Authors
Paper Abstract
Natural language processing (NLP) shows promise as a means to automate the labelling of hospital-scale neuroradiology magnetic resonance imaging (MRI) datasets for computer vision applications. To date, however, there has been no thorough investigation into the validity of this approach, including determining the accuracy of report labels compared to image labels as well as examining the performance of non-specialist labellers. In this work, we draw on the experience of a team of neuroradiologists who labelled over 5000 MRI neuroradiology reports as part of a project to build a dedicated deep learning-based neuroradiology report classifier. We show that, in our experience, assigning binary labels (i.e. normal vs abnormal) to images from reports alone is highly accurate. In contrast to the binary labels, however, the accuracy of more granular labelling is dependent on the category, and we highlight reasons for this discrepancy. We also show that downstream model performance is reduced when labelling of training reports is performed by a non-specialist. To allow other researchers to accelerate their research, we make our refined abnormality definitions and labelling rules available, as well as our easy-to-use radiology report labelling app which helps streamline this process.