Paper Title
CLaCLab at SocialDisNER: Using Medical Gazetteers for Named-Entity Recognition of Disease Mentions in Spanish Tweets
Paper Authors
Paper Abstract
This paper summarizes the CLaC submission for SMM4H 2022 Task 10, which concerns the recognition of diseases mentioned in Spanish tweets. Before classifying each token, we encode it with a transformer encoder using features from Multilingual RoBERTa Large, a UMLS gazetteer, and a DISTEMIST gazetteer, among others. We obtain a strict F1 score of 0.869, compared with a competition mean of 0.675, standard deviation of 0.245, and median of 0.761.
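The abstract does not spell out how the gazetteer features are computed, so the following is only a minimal sketch of one common approach: longest-match lookup that marks each token as beginning (B), inside (I), or outside (O) a gazetteer entry. The function name `gazetteer_features`, the `max_len` parameter, and the toy gazetteer entries are all hypothetical and are not taken from the paper; a real system would load entries from UMLS or DISTEMIST and would likely combine these tags with the RoBERTa token embeddings before the transformer encoder.

```python
from typing import List, Set, Tuple


def gazetteer_features(
    tokens: List[str],
    gazetteer: Set[Tuple[str, ...]],
    max_len: int = 4,
) -> List[str]:
    """Tag each token B/I/O using the longest gazetteer match at each position.

    Assumption: matching is case-insensitive and greedy (longest span first).
    """
    tags = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate span first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(lowered[i:i + n]) in gazetteer:
                matched = n
                break
        if matched:
            tags[i] = "B"
            for j in range(i + 1, i + matched):
                tags[j] = "I"
            i += matched  # skip past the matched span
        else:
            i += 1
    return tags


# Hypothetical toy gazetteer; real entries would come from UMLS or DISTEMIST.
toy_gazetteer = {("diabetes",), ("cáncer", "de", "mama")}
tokens = ["Mi", "madre", "tiene", "cáncer", "de", "mama", "y", "diabetes"]
print(gazetteer_features(tokens, toy_gazetteer))
```

In a setup like the one the abstract describes, each B/I/O tag could be mapped to a small embedding or one-hot vector and concatenated with the token's contextual representation; that wiring is an assumption here, not a detail confirmed by the source.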