Paper title
syrapropa at SemEval-2020 Task 11: BERT-based Models Design For Propagandistic Technique and Span Detection
Paper authors
Paper abstract
This paper describes the BERT-based models proposed for the two subtasks of SemEval-2020 Task 11: Detection of Propaganda Techniques in News Articles. We first build a model for Span Identification (SI) based on SpanBERT, and aid detection with a deeper model and a sentence-level representation. We then develop a hybrid model for Technique Classification (TC). The hybrid model is composed of three submodels: two BERT models with different training methods and a feature-based Logistic Regression model. We address the imbalanced dataset by adjusting the cost function. We rank seventh in the SI subtask (F1-measure of 0.4711), and third in the TC subtask (F1-measure of 0.6783) on the development set.