Paper Title
Graph-Text Multi-Modal Pre-training for Medical Representation Learning
Paper Authors
Paper Abstract
As the volume of Electronic Health Records (EHR) grows sharply, there has been emerging interest in learning representations of EHR for healthcare applications. Representation learning of EHR requires appropriate modeling of its two dominant modalities: structured data and unstructured text. In this paper, we present MedGTX, a pre-trained model for multi-modal representation learning of structured and textual EHR data. MedGTX uses a novel graph encoder to exploit the graphical nature of structured EHR data, a text encoder to handle unstructured text, and a cross-modal encoder to learn a joint representation space. We pre-train our model through four proxy tasks on MIMIC-III, an open-source EHR dataset, and evaluate our model on two clinical benchmarks and three novel downstream tasks that tackle real-world problems in EHR data. The results consistently show the effectiveness of pre-training the model for joint representation of both structured and unstructured information from EHR. Given the promising performance of MedGTX, we believe this work opens a new door to jointly understanding the two fundamental modalities of EHR data.
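The abstract describes a three-encoder design: a graph encoder for structured EHR data, a text encoder for clinical notes, and a cross-modal encoder that fuses the two into a joint representation. The sketch below is a minimal, hypothetical PyTorch illustration of how such a cross-modal fusion layer could look; the class name, dimensions, and single-direction cross-attention are assumptions for illustration only, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class CrossModalFusionLayer(nn.Module):
    """Illustrative sketch: graph-token embeddings attend over text-token
    embeddings to form a joint representation (hypothetical design)."""

    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, graph_tokens: torch.Tensor, text_tokens: torch.Tensor) -> torch.Tensor:
        # Graph tokens (queries) attend over text tokens (keys/values);
        # only one attention direction is shown for brevity.
        attended, _ = self.cross_attn(graph_tokens, text_tokens, text_tokens)
        x = self.norm1(graph_tokens + attended)
        return self.norm2(x + self.ffn(x))


# Usage: assume graph nodes and text tokens were already embedded
# by their respective modality-specific encoders.
graph_tokens = torch.randn(2, 32, 256)   # (batch, num_graph_nodes, dim)
text_tokens = torch.randn(2, 128, 256)   # (batch, num_text_tokens, dim)
joint = CrossModalFusionLayer()(graph_tokens, text_tokens)
print(joint.shape)  # torch.Size([2, 32, 256])
```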