Paper Title
Rapid and robust endoscopic content area estimation: A lean GPU-based pipeline and curated benchmark dataset
Paper Authors
Paper Abstract
Endoscopic content area refers to the informative area enclosed by the dark, non-informative border regions present in most endoscopic footage. Estimation of the content area is a common task in endoscopic image processing and computer vision pipelines. Despite the apparent simplicity of the problem, several factors make reliable real-time estimation surprisingly challenging. The lack of rigorous investigation into the topic, combined with the absence of a common benchmark dataset for this task, has been a long-standing issue in the field. In this paper, we propose two variants of a lean GPU-based computational pipeline combining edge detection and circle fitting. The two variants differ in the features used to extract content area edge point candidates: one relies on handcrafted features, the other on learned features. We also present a first-of-its-kind dataset of manually annotated and pseudo-labelled content areas across a range of surgical indications. To encourage further developments, the curated dataset and an implementation of both algorithms have been made publicly available (https://doi.org/10.7303/syn32148000, https://github.com/charliebudd/torch-content-area). We compare our proposed algorithm with a state-of-the-art U-Net-based approach and demonstrate a significant improvement in terms of both accuracy (Hausdorff distance: 6.3 px versus 118.1 px) and computational time (average runtime per frame: 0.13 ms versus 11.2 ms).
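The abstract only describes the method at a high level. As a rough illustration of the general idea (extracting edge point candidates, then fitting a circle to them), the sketch below shows a minimal PyTorch version of that two-step structure. It is not the authors' pipeline: the row-scanning heuristic, the Sobel filter, and the thresholds are illustrative assumptions, and the released torch-content-area package should be consulted for the actual implementation.

```python
# Minimal sketch (not the authors' implementation) of the general approach named in
# the abstract: find candidate content-area edge points with simple handcrafted
# gradient features, then fit a circle to them with a least-squares fit.
# The scanning heuristic and threshold below are illustrative assumptions.
import torch
import torch.nn.functional as F


def candidate_edge_points(gray: torch.Tensor, rows: int = 32, min_grad: float = 0.05):
    """Scan a subset of image rows and keep the strongest dark-to-bright
    (left border) and bright-to-dark (right border) transitions as candidates."""
    h, w = gray.shape
    sobel = torch.tensor([[-1.0, 0.0, 1.0],
                          [-2.0, 0.0, 2.0],
                          [-1.0, 0.0, 1.0]], device=gray.device)
    grad_x = F.conv2d(gray[None, None], sobel[None, None], padding=1)[0, 0]

    ys = torch.linspace(1, h - 2, rows, device=gray.device).long()
    points = []
    for y in ys:
        row = grad_x[y]
        left_x = int(torch.argmax(row))    # strongest rise in intensity
        right_x = int(torch.argmin(row))   # strongest fall in intensity
        if row[left_x] > min_grad:
            points.append((float(left_x), float(y)))
        if -row[right_x] > min_grad:
            points.append((float(right_x), float(y)))
    return torch.tensor(points, device=gray.device)


def fit_circle(points: torch.Tensor):
    """Algebraic (Kasa) least-squares circle fit: solve
    x^2 + y^2 = 2*cx*x + 2*cy*y + (r^2 - cx^2 - cy^2) for (cx, cy, r)."""
    x, y = points[:, 0], points[:, 1]
    A = torch.stack([2 * x, 2 * y, torch.ones_like(x)], dim=1)
    b = (x ** 2 + y ** 2).unsqueeze(1)
    w = torch.linalg.lstsq(A, b).solution.squeeze(1)
    cx, cy = w[0], w[1]
    r = torch.sqrt(w[2] + cx ** 2 + cy ** 2)
    return cx, cy, r


# Usage on a single grayscale frame with values in [0, 1], e.g. on the GPU:
# gray = frame.to("cuda")
# cx, cy, r = fit_circle(candidate_edge_points(gray))
```

The learned-feature variant described in the abstract would replace the handcrafted gradient scan with a model that scores edge point candidates, while keeping the same circle-fitting stage.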