A Pathologist-Annotated Dataset for Validating Artificial Intelligence: A Project Description and Pilot Study

Paper Title

A Pathologist-Annotated Dataset for Validating Artificial Intelligence: A Project Description and Pilot Study

Authors

Dudgeon, Sarah N, Wen, Si, Hanna, Matthew G, Gupta, Rajarsi, Amgad, Mohamed, Sheth, Manasi, Marble, Hetal, Huang, Richard, Herrmann, Markus D, Szu, Clifford H., Tong, Darick, Werness, Bruce, Szu, Evan, Larsimont, Denis, Madabhushi, Anant, Hytopoulos, Evangelos, Chen, Weijie, Singh, Rajendra, Hart, Steven N., Saltz, Joel, Salgado, Roberto, Gallas, Brandon D

Abstract

Purpose: In this work, we present a collaboration to create a validation dataset of pathologist annotations for algorithms that process whole slide images (WSIs). We focus on data collection and evaluation of algorithm performance in the context of estimating the density of stromal tumor infiltrating lymphocytes (sTILs) in breast cancer. Methods: We digitized 64 glass slides of hematoxylin- and eosin-stained ductal carcinoma core biopsies prepared at a single clinical site. We created training materials and workflows to crowdsource pathologist image annotations on two modes: an optical microscope and two digital platforms. The workflows collect the ROI type, a decision on whether the ROI is appropriate for estimating the density of sTILs, and if appropriate, the sTIL density value for that ROI. Results: The pilot study yielded an abundant number of cases with nominal sTIL infiltration. Furthermore, we found that the sTIL densities are correlated within a case, and there is notable pathologist variability. Consequently, we outline plans to improve our ROI and case sampling methods. We also outline statistical methods to account for ROI correlations within a case and pathologist variability when validating an algorithm. Conclusion: We have built workflows for efficient data collection and tested them in a pilot study. As we prepare for pivotal studies, we will consider what it will take for the dataset to be fit for a regulatory purpose: study size, patient population, and pathologist training and qualifications. To this end, we will elicit feedback from the FDA via the Medical Device Development Tool program and from the broader digital pathology and AI community. Ultimately, we intend to share the dataset, statistical methods, and lessons learned.
