Paper Title

Evaluating the Construct Validity of Text Embeddings with Application to Survey Questions

Authors

Fang, Qixiang; Nguyen, Dong; Oberski, Daniel L.

Abstract

Text embedding models from Natural Language Processing can map text data (e.g. words, sentences, documents) to supposedly meaningful numerical representations (a.k.a. text embeddings). While such models are increasingly applied in social science research, one important issue is often not addressed: the extent to which these embeddings are valid representations of constructs relevant for social science research. We therefore propose the use of the classic construct validity framework to evaluate the validity of text embeddings. We show how this framework can be adapted to the opaque and high-dimensional nature of text embeddings, with application to survey questions. We include several popular text embedding methods (e.g. fastText, GloVe, BERT, Sentence-BERT, Universal Sentence Encoder) in our construct validity analyses. We find evidence of convergent and discriminant validity in some cases. We also show that embeddings can be used to predict respondents' answers to completely new survey questions. Furthermore, BERT-based embedding techniques and the Universal Sentence Encoder provide more valid representations of survey questions than do others. Our results thus highlight the necessity to examine the construct validity of text embeddings before deploying them in social science research.
