Paper Title
Transformer Encoder for Social Science
Paper Authors
Paper Abstract
High-quality text data has become an important data source for social scientists, and recent social science research has witnessed the success of pretrained deep neural network models such as BERT and RoBERTa. In this paper, we propose a compact pretrained deep neural network, Transformer Encoder for Social Science (TESS), explicitly designed to tackle text processing tasks in social science research. Using two validation tests, we demonstrate that TESS outperforms BERT and RoBERTa by 16.7% on average when the number of training samples is limited (<1,000 training instances). These results indicate the superiority of TESS over BERT and RoBERTa on social science text processing tasks. Lastly, we discuss the limitations of our model and offer advice to future researchers.
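To make the low-resource fine-tuning setting concrete, below is a minimal sketch of how a compact pretrained encoder could be fine-tuned on a small labeled text dataset (the <1,000-instance regime the abstract describes), written with the Hugging Face transformers API. This is not the paper's own pipeline: the checkpoint name "bert-base-uncased" is a stand-in, since no public TESS checkpoint identifier is given here, and the texts and labels are placeholders.

import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Toy stand-ins for a small labeled social science corpus (placeholders).
texts = ["Example legislative speech ...", "Example news article ..."]
labels = [0, 1]

# "bert-base-uncased" is a stand-in; substitute any encoder checkpoint.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Tokenize the corpus and wrap it in a simple PyTorch dataset/loader.
enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
dataset = TensorDataset(enc["input_ids"], enc["attention_mask"], torch.tensor(labels))
loader = DataLoader(dataset, batch_size=2, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(3):  # a few epochs are typical when training data is scarce
    for input_ids, attention_mask, y in loader:
        out = model(input_ids=input_ids, attention_mask=attention_mask, labels=y)
        out.loss.backward()
        optimizer.step()
        optimizer.zero_grad()

With real data, one would load a few hundred labeled documents, hold out a validation split, and monitor validation accuracy across epochs, since small training sets are exhausted quickly and overfit easily.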