Paper Title
CheXpert++: Approximating the CheXpert Labeler for Speed, Differentiability, and Probabilistic Output
Paper Authors
Paper Abstract
It is often infeasible or impossible to obtain ground-truth labels for medical data. To circumvent this, one may build rule-based or other expert-knowledge-driven labelers to ingest data and yield silver labels absent any ground-truth training data. One popular such labeler is CheXpert, which produces diagnostic labels for chest X-ray radiology reports. CheXpert is very useful, but it is relatively computationally slow (especially when integrated with end-to-end neural pipelines), is non-differentiable (so it cannot be used in any application that requires gradients to flow through the labeler), and does not yield probabilistic outputs, which limits our ability to improve the quality of the silver labeler through techniques such as active learning. In this work, we solve all three of these problems with $\texttt{CheXpert++}$, a BERT-based, high-fidelity approximation to CheXpert. $\texttt{CheXpert++}$ achieves 99.81\% parity with CheXpert, which means it can reliably be used as a drop-in replacement for CheXpert, all while being significantly faster, fully differentiable, and probabilistic in output. Error analysis of $\texttt{CheXpert++}$ also demonstrates that it has a tendency to actually correct errors in the CheXpert labels: when the two disagree, a clinician more often prefers the $\texttt{CheXpert++}$ label on all but one disease task. To further demonstrate the utility of these advantages, we conduct a proof-of-concept active learning study, demonstrating that one iteration of active-learning-inspired re-training improves accuracy on an expert-labeled random subset of report sentences by approximately 8\% over raw, unaltered CheXpert. These findings suggest that simple techniques in co-learning and active learning can yield high-quality labelers under minimal and controllable human labeling demands.
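
The core idea described above (train a BERT model to reproduce a rule-based labeler's silver labels) can be sketched as follows. This is a minimal illustration, not the authors' released code: the use of the HuggingFace `transformers` API, the `CheXpertApprox` class name, the `training_step` helper, and the choice of one 4-way softmax head per CheXpert observation are all assumptions made for the sketch.

```python
# A minimal sketch of distilling a rule-based labeler such as CheXpert into
# a BERT model. All names here are illustrative, not the paper's code.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

N_OBSERVATIONS = 14   # CheXpert tracks 14 observations per report
N_CLASSES = 4         # e.g. blank / positive / negative / uncertain

class CheXpertApprox(nn.Module):
    """BERT encoder with one softmax head per CheXpert observation."""
    def __init__(self, encoder_name: str = "bert-base-uncased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        self.heads = nn.ModuleList(
            nn.Linear(hidden, N_CLASSES) for _ in range(N_OBSERVATIONS)
        )

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # [CLS] representation
        # One (batch, N_CLASSES) logit tensor per observation.
        return torch.stack([head(cls) for head in self.heads], dim=1)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = CheXpertApprox()
loss_fn = nn.CrossEntropyLoss()

def training_step(reports, silver_labels, optimizer):
    """One supervised step against CheXpert's silver labels.

    `silver_labels` is a (batch, N_OBSERVATIONS) LongTensor obtained by
    running the original CheXpert labeler over `reports`.
    """
    batch = tokenizer(reports, padding=True, truncation=True,
                      return_tensors="pt")
    logits = model(batch["input_ids"], batch["attention_mask"])
    # Sum per-observation cross-entropies by flattening tasks into the batch.
    loss = loss_fn(logits.view(-1, N_CLASSES), silver_labels.view(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the model ends in softmax heads, its outputs are probabilistic and the whole pipeline is differentiable, which is exactly what the rule-based original lacks.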
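
The abstract's active-learning study can likewise be sketched: score unlabeled sentences by the model's predictive uncertainty, route the most uncertain ones to a clinician, and re-train on the corrected labels. The acquisition function below (mean per-observation entropy) and both function names are illustrative assumptions; the paper's exact selection criterion may differ. `model` and `tokenizer` are the objects from the previous sketch.

```python
# A minimal sketch of uncertainty-driven example selection, assuming the
# model and tokenizer defined above. The entropy criterion is illustrative.
import torch

@torch.no_grad()
def uncertainty_scores(sentences, batch_size: int = 32):
    """Mean per-observation predictive entropy for each sentence."""
    model.eval()
    scores = []
    for i in range(0, len(sentences), batch_size):
        batch = tokenizer(sentences[i:i + batch_size], padding=True,
                          truncation=True, return_tensors="pt")
        probs = torch.softmax(
            model(batch["input_ids"], batch["attention_mask"]), dim=-1)
        entropy = -(probs * torch.log(probs + 1e-12)).sum(dim=-1)  # (B, N_OBS)
        scores.append(entropy.mean(dim=1))
    return torch.cat(scores)

def select_for_expert_review(sentences, budget: int = 100):
    """Pick the `budget` sentences the model is least certain about."""
    scores = uncertainty_scores(sentences)
    top = torch.topk(scores, k=min(budget, len(sentences))).indices.tolist()
    return [sentences[i] for i in top]
```

One round of this loop (select, expert-label, re-train with `training_step`) corresponds to the single iteration of active-learning-inspired re-training that the abstract reports improves accuracy over raw CheXpert.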