Paper Title

Retrieval of surgical phase transitions using reinforcement learning

Authors

Zhang, Yitong, Bano, Sophia, Page, Ann-Sophie, Deprest, Jan, Stoyanov, Danail, Vasconcelos, Francisco

Abstract

In minimally invasive surgery, surgical workflow segmentation from video analysis is a well-studied topic. The conventional approach defines it as a multi-class classification problem, where individual video frames are attributed a surgical phase label. We introduce a novel reinforcement learning formulation for offline phase transition retrieval. Instead of attempting to classify every video frame, we identify the timestamp of each phase transition. By construction, our model does not produce spurious and noisy phase transitions, but contiguous phase blocks. We investigate two different configurations of this model. The first does not require processing all frames in a video (only <60% and <20% of frames in 2 different applications), while producing results slightly under the state-of-the-art accuracy. The second configuration processes all video frames, and outperforms the state-of-the-art at a comparable computational cost. We compare our method against the recent top-performing frame-based approaches TeCNO and Trans-SVNet on the public dataset Cholec80 and also on an in-house dataset of laparoscopic sacrocolpopexy. We perform both a frame-based (accuracy, precision, recall and F1-score) and an event-based (event ratio) evaluation of our algorithms.
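The key property claimed above — that retrieving transition timestamps yields contiguous phase blocks by construction — can be illustrated with a minimal sketch. This is not the authors' implementation; the function names and the event-ratio definition (predicted vs. ground-truth transition counts) are illustrative assumptions.

```python
def transitions_to_labels(transitions, num_frames):
    """Expand sorted transition timestamps into per-frame phase labels.

    transitions: frame indices where a new phase begins (phase 0 starts at frame 0).
    By construction the output is a sequence of contiguous phase blocks,
    with no spurious or noisy transitions possible.
    """
    labels = []
    phase = 0
    start = 0
    for end in list(transitions) + [num_frames]:
        labels.extend([phase] * (end - start))
        start = end
        phase += 1
    return labels


def event_ratio(pred_labels, true_labels):
    """One plausible event-based metric (assumed definition): ratio of the
    number of predicted phase transitions to the ground-truth number."""
    count = lambda seq: sum(1 for a, b in zip(seq, seq[1:]) if a != b)
    return count(pred_labels) / max(count(true_labels), 1)


# A 10-frame video with phase changes at frames 4 and 7:
labels = transitions_to_labels([4, 7], 10)
# labels == [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
```

A frame-based classifier can predict an arbitrary label per frame and thus produce noisy flickering between phases; any output of `transitions_to_labels` is piecewise constant, which is why the event ratio of such a model cannot exceed the number of retrieved transitions.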
