Paper title
Metadata-enhanced contrastive learning from retinal optical coherence tomography images
Paper authors
Paper abstract
Deep learning has the potential to automate the screening, monitoring and grading of disease in medical images. Pretraining with contrastive learning enables models to extract robust and generalisable features from natural image datasets, facilitating label-efficient downstream image analysis. However, the direct application of conventional contrastive methods to medical datasets introduces two domain-specific issues. Firstly, several image transformations which have been shown to be crucial for effective contrastive learning do not translate from the natural image to the medical image domain. Secondly, the assumption made by conventional methods, that any two images are dissimilar, is systematically misleading in medical datasets depicting the same anatomy and disease. This problem is exacerbated in longitudinal image datasets, which repeatedly image the same patient cohort to monitor disease progression over time. In this paper, we tackle these issues by extending conventional contrastive frameworks with a novel metadata-enhanced strategy. Our approach uses widely available patient metadata to approximate the true set of inter-image contrastive relationships. To this end, we employ records of patient identity, eye position (i.e. left or right) and time series information. In experiments using two large longitudinal datasets containing 170,427 retinal OCT images of 7,912 patients with age-related macular degeneration (AMD), we evaluate the utility of using metadata to incorporate the temporal dynamics of disease progression into pretraining. Our metadata-enhanced approach outperforms both standard contrastive methods and a retinal image foundation model in five out of six AMD-related image-level downstream tasks. Owing to its modularity, our method can be quickly and cost-effectively tested to establish the potential benefits of including available metadata in contrastive pretraining.
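The abstract's central mechanism, using metadata to redefine which image pairs count as positives during contrastive pretraining, can be illustrated with a short sketch. The following is a minimal PyTorch example and not the authors' implementation: a supervised-contrastive (SupCon-style) loss in which two embeddings are treated as positives whenever their records share a patient identity and eye position, approximating the inter-image relationships described above. The time-series component is omitted for brevity, and the function name `metadata_contrastive_loss` and its signature are hypothetical.

```python
import torch
import torch.nn.functional as F


def metadata_contrastive_loss(embeddings: torch.Tensor,
                              patient_ids: torch.Tensor,
                              eye_positions: torch.Tensor,
                              temperature: float = 0.1) -> torch.Tensor:
    """SupCon-style loss with positives defined by shared metadata.

    embeddings:    (N, D) projection-head outputs for one batch.
    patient_ids:   (N,) integer patient identifiers.
    eye_positions: (N,) e.g. 0 = left eye, 1 = right eye.
    """
    z = F.normalize(embeddings, dim=1)            # unit-norm features
    sim = z @ z.t() / temperature                 # (N, N) similarity logits

    # Positive pairs: same patient AND same eye, i.e. the same anatomy,
    # possibly imaged at different visits in a longitudinal dataset.
    same_patient = patient_ids.unsqueeze(0) == patient_ids.unsqueeze(1)
    same_eye = eye_positions.unsqueeze(0) == eye_positions.unsqueeze(1)
    pos_mask = (same_patient & same_eye).float()
    pos_mask.fill_diagonal_(0)                    # exclude self-pairs

    # Remove self-similarity from the softmax denominator.
    logits_mask = torch.ones_like(sim)
    logits_mask.fill_diagonal_(0)
    exp_sim = torch.exp(sim) * logits_mask
    log_prob = sim - torch.log(exp_sim.sum(dim=1, keepdim=True))

    # Average log-probability over each anchor's positives; anchors with
    # no in-batch positives are skipped.
    pos_counts = pos_mask.sum(dim=1)
    valid = pos_counts > 0
    mean_log_prob_pos = (pos_mask * log_prob).sum(dim=1)[valid] / pos_counts[valid]
    return -mean_log_prob_pos.mean()
```

In a standard SimCLR-style pipeline, a loss of this shape would replace the usual NT-Xent objective, so that repeated OCT scans of the same eye across visits no longer act as false negatives but instead pull together in the embedding space.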