Paper Title
Federated pretraining and fine tuning of BERT using clinical notes from multiple silos
Paper Authors
Paper Abstract
Large-scale contextual representation models, such as BERT, have significantly advanced natural language processing (NLP) in recent years. However, in certain domains, such as healthcare, accessing diverse large-scale text data from multiple institutions is extremely challenging due to privacy and regulatory reasons. In this article, we show that it is possible to both pretrain and fine-tune BERT models in a federated manner using clinical texts from different silos without moving the data.
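To make the federated setup concrete, below is a minimal sketch of FedAvg-style weight aggregation across silos, assuming each silo trains a local copy of the same BERT architecture and only model parameters (never clinical notes) leave the institution. This is an illustration, not the paper's published code; the function names `federated_average` and `local_update` and all hyperparameters are hypothetical.

```python
# Sketch of federated averaging of per-silo model weights (assumption: the
# paper's exact aggregation details may differ; this shows the general idea).
from collections import OrderedDict
from copy import deepcopy

import torch
import torch.nn as nn


def federated_average(state_dicts, weights):
    """Weighted average of per-silo state_dicts (e.g. weighted by local dataset size)."""
    total = float(sum(weights))
    avg = OrderedDict()
    for key in state_dicts[0]:
        avg[key] = sum(
            (w / total) * sd[key].float() for sd, w in zip(state_dicts, weights)
        )
    return avg


def local_update(global_state, model, data_loader, epochs=1, lr=5e-5):
    """One round of local training inside a silo; returns updated weights only."""
    model.load_state_dict(global_state)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for inputs, labels in data_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(inputs), labels)  # MLM loss for pretraining, task loss for fine-tuning
            loss.backward()
            optimizer.step()
    return deepcopy(model.state_dict())
```

In each communication round, a coordinating server would broadcast the current `global_state` to the silos, collect the weights returned by `local_update`, and replace the global weights with `federated_average(...)`. The same loop applies to masked-language-model pretraining and to downstream fine-tuning; only the model head and loss change.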