Paper Title

Unsupervised Numerical Reasoning to Extract Phenotypes from Clinical Text by Leveraging External Knowledge

Authors

Ashwani Tanwar, Jingqing Zhang, Julia Ive, Vibhor Gupta, Yike Guo

Abstract

Extracting phenotypes from clinical text has been shown to be useful for a variety of clinical use cases, such as identifying patients with rare diseases. However, reasoning with numerical values remains challenging for phenotyping in clinical text; for example, a temperature of 102F represents Fever. Current state-of-the-art phenotyping models can detect general phenotypes but perform poorly on phenotypes that require numerical reasoning. We present a novel unsupervised methodology that leverages external knowledge and contextualized word embeddings from ClinicalBERT for numerical reasoning in a variety of phenotypic contexts. Compared against unsupervised benchmarks, it shows a substantial performance improvement, with absolute gains of up to 79% in generalized Recall and up to 71% in F1 score. In the supervised setting, it also surpasses alternative approaches, with absolute gains of up to 70% in generalized Recall and up to 44% in F1 score.
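To make the "temperature 102F representing Fever" example concrete, the following is a minimal rule-based sketch of numerical reasoning over a single measurement. It is not the paper's method, which is unsupervised and relies on ClinicalBERT embeddings; the function name `phenotype_from_temperature`, the regular expression, and the 100.4 F cutoff are illustrative assumptions introduced here.

```python
import re

# Assumed cutoff for illustration only: 100.4 F (38 C) is a commonly used
# fever threshold; it is not a value taken from the paper.
FEVER_THRESHOLD_F = 100.4

# Hypothetical pattern: "temp" or "temperature", a few non-digit characters,
# then a Fahrenheit reading such as "102F" or "98.6 F".
TEMP_PATTERN = re.compile(
    r"temp(?:erature)?\D{0,10}?(\d{2,3}(?:\.\d+)?)\s*°?\s*F\b",
    re.IGNORECASE,
)


def phenotype_from_temperature(text):
    """Return "Fever" if the text reports a temperature above the cutoff, else None."""
    match = TEMP_PATTERN.search(text)
    if match is None:
        return None  # no temperature reading found in the text
    value = float(match.group(1))
    return "Fever" if value > FEVER_THRESHOLD_F else None


if __name__ == "__main__":
    print(phenotype_from_temperature("temperature 102F"))
    print(phenotype_from_temperature("Temp 98.6 F"))
```

A hand-written rule like this covers only one measurement and one surface form, which is the kind of brittleness that motivates combining external knowledge with contextualized embeddings instead.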
