Paper Title

Optimization for Medical Image Segmentation: Theory and Practice when evaluating with Dice Score or Jaccard Index

Authors

Tom Eelbode, Jeroen Bertels, Maxim Berman, Dirk Vandermeulen, Frederik Maes, Raf Bisschops, Matthew B. Blaschko

Abstract

In many medical imaging and classical computer vision tasks, the Dice score and Jaccard index are used to evaluate segmentation performance. Despite the existence and great empirical success of metric-sensitive losses, i.e. relaxations of these metrics such as soft Dice, soft Jaccard and Lovász-Softmax, many researchers still use per-pixel losses, such as (weighted) cross-entropy, to train CNNs for segmentation. Therefore, the target metric is in many cases not directly optimized. From a theoretical perspective, we investigate the relations within the group of metric-sensitive loss functions and question the existence of an optimal weighting scheme for weighted cross-entropy to optimize the Dice score and Jaccard index at test time. We find that the Dice score and Jaccard index approximate each other relatively and absolutely, but we find no such approximation for a weighted Hamming similarity. For the Tversky loss, the approximation gets monotonically worse when deviating from the trivial weight setting where soft Tversky equals soft Dice. We verify these results empirically in an extensive validation on six medical segmentation tasks and can confirm that metric-sensitive losses are superior to cross-entropy-based loss functions when evaluating with the Dice score or Jaccard index. This further holds in a multi-class setting, and across different object sizes and foreground/background ratios. These results encourage a wider adoption of metric-sensitive loss functions for medical segmentation tasks where the performance measure of interest is the Dice score or Jaccard index.
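The relations mentioned in the abstract can be illustrated with a small numerical sketch. The code below is not taken from the paper; it uses common textbook definitions of soft Dice, soft Jaccard and soft Tversky, and the function names, the parameters `alpha` and `beta`, and the random test data are all assumptions chosen for illustration. It assumes a non-empty ground-truth mask, so no smoothing term is added. Under these definitions, Jaccard is a monotone function of Dice, J = D / (2 - D), and soft Tversky with both weights set to 0.5 coincides with soft Dice.

```python
import numpy as np


def soft_dice(p, y):
    """Soft Dice score between predicted probabilities p and binary mask y."""
    intersection = np.sum(p * y)
    return 2.0 * intersection / (np.sum(p) + np.sum(y))


def soft_jaccard(p, y):
    """Soft Jaccard index between predicted probabilities p and binary mask y."""
    intersection = np.sum(p * y)
    union = np.sum(p) + np.sum(y) - intersection
    return intersection / union


def soft_tversky(p, y, alpha, beta):
    """Soft Tversky index; alpha weights false positives, beta false negatives."""
    tp = np.sum(p * y)          # soft true positives
    fp = np.sum(p * (1.0 - y))  # soft false positives
    fn = np.sum((1.0 - p) * y)  # soft false negatives
    return tp / (tp + alpha * fp + beta * fn)


# Hypothetical data: a random binary mask and random "probabilities".
rng = np.random.default_rng(0)
y = (rng.random((8, 8)) > 0.7).astype(float)
p = rng.random((8, 8))

d = soft_dice(p, y)
j = soft_jaccard(p, y)
t_dice = soft_tversky(p, y, alpha=0.5, beta=0.5)
t_jaccard = soft_tversky(p, y, alpha=1.0, beta=1.0)

print("soft Dice:               ", d)
print("soft Jaccard:            ", j)
print("D / (2 - D):             ", d / (2.0 - d))
print("Tversky(0.5, 0.5) = Dice:", t_dice)
print("Tversky(1, 1) = Jaccard: ", t_jaccard)
```

Because J = D / (2 - D) is strictly increasing in D, ranking predictions by one score ranks them identically by the other for a single image, which gives some intuition for why the two metrics approximate each other. This sketch does not reproduce the paper's formal bounds.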
