Paper Title

Cyclical Self-Supervision for Semi-Supervised Ejection Fraction Prediction from Echocardiogram Videos

Paper Authors

Weihang Dai, Xiaomeng Li, Xinpeng Ding, Kwang-Ting Cheng

Paper Abstract

Left-ventricular ejection fraction (LVEF) is an important indicator of heart failure. Existing methods for LVEF estimation from video require large amounts of annotated data to achieve high performance, e.g., using 10,030 labeled echocardiogram videos to achieve a mean absolute error (MAE) of 4.10. Labeling these videos is time-consuming, however, and this limits potential downstream applications to other heart diseases. This paper presents the first semi-supervised approach for LVEF prediction. Unlike general video prediction tasks, LVEF prediction is specifically related to changes in the left ventricle (LV) in echocardiogram videos. By incorporating knowledge learned from predicting LV segmentations into LVEF regression, we can provide additional context to the model for better predictions. To this end, we propose a novel Cyclical Self-Supervision (CSS) method for learning video-based LV segmentation, motivated by the observation that the heartbeat is a cyclical process with temporal repetition. Prediction masks from our segmentation model can then be used as additional input for LVEF regression to provide spatial context for the LV region. We also introduce teacher-student distillation to distill the information from LV segmentation masks into an end-to-end LVEF regression model that only requires video inputs. Results show that our method outperforms alternative semi-supervised methods and can achieve an MAE of 4.17, which is competitive with state-of-the-art supervised performance, using half the number of labels. Validation on an external dataset also shows improved generalization ability from using our method. Our code is available at https://github.com/xmed-lab/CSS-SemiVideo.
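
The abstract describes three technical components: a cyclical self-supervision loss that exploits the heartbeat's temporal repetition to learn video-based LV segmentation from unlabeled clips, predicted masks fed to the EF regressor as extra spatial context, and teacher-student distillation into a video-only model. Below is a minimal PyTorch sketch of these ideas; all names, tensor shapes, and the choice of MSE losses are illustrative assumptions, not the authors' implementation (see https://github.com/xmed-lab/CSS-SemiVideo for the real code).

    import torch
    import torch.nn.functional as F

    def cyclical_self_supervision_loss(seg_logits, cycle_len):
        # seg_logits: (B, T, H, W) per-frame LV segmentation logits on an
        # unlabeled clip; cycle_len: estimated number of frames per heartbeat.
        # Frames one full cycle apart show the LV at the same cardiac phase,
        # so we penalize disagreement between them (assumed MSE consistency).
        probs = torch.sigmoid(seg_logits)
        return F.mse_loss(probs[:, :-cycle_len], probs[:, cycle_len:])

    def distillation_loss(student_ef, teacher_ef):
        # The student sees only the raw video; the teacher also sees predicted
        # LV masks. Match the student's EF regression to the teacher's output.
        return F.mse_loss(student_ef, teacher_ef.detach())

    # Teacher input: predicted masks concatenated as an extra channel so the
    # regressor gets explicit spatial context for the LV region.
    video = torch.randn(2, 1, 32, 112, 112)           # (B, C, T, H, W) grayscale clip
    seg_logits = torch.randn(2, 32, 112, 112)         # stand-in for a segmentation net's output
    masks = torch.sigmoid(seg_logits).unsqueeze(1)    # (B, 1, T, H, W)
    teacher_input = torch.cat([video, masks], dim=1)  # (B, 2, T, H, W)
    css_loss = cyclical_self_supervision_loss(seg_logits, cycle_len=16)

At inference the distilled student needs only the raw video, which is why the abstract stresses an end-to-end model that "only requires video inputs".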
