Paper Title
Analyzing the Surprising Variability in Word Embedding Stability Across Languages
Authors
Abstract
Word embeddings are powerful representations that form the foundation of many natural language processing architectures, both in English and in other languages. To gain further insight into word embeddings, we explore their stability (e.g., overlap between the nearest neighbors of a word in different embedding spaces) in diverse languages. We discuss linguistic properties that are related to stability, drawing out insights about correlations with affixing, language gender systems, and other features. This has implications for embedding use, particularly in research that uses them to study language trends.
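To make the notion of stability concrete, the following is a minimal sketch of one way to measure the overlap between a word's nearest neighbors in two embedding spaces. The abstract does not specify the neighborhood size, the similarity measure, or how scores are aggregated over more than two spaces, so the choice of cosine similarity, the parameter `k`, the function names, and the toy vectors are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbors(word, space, k):
    """Return the set of the k words closest to `word` in `space`.

    `space` is a dict mapping each word to its vector (hypothetical layout).
    Ties are broken alphabetically so the result is deterministic.
    """
    target = space[word]
    scored = [
        (cosine(target, vec), other)
        for other, vec in space.items()
        if other != word
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return {other for _, other in scored[:k]}


def stability(word, space_a, space_b, k=10):
    """Fraction of the k nearest neighbors of `word` shared by both spaces."""
    nn_a = nearest_neighbors(word, space_a, k)
    nn_b = nearest_neighbors(word, space_b, k)
    return len(nn_a & nn_b) / k


# Toy 2-dimensional embedding spaces, invented for this example only.
space_a = {
    "cat": [1.0, 0.0],
    "dog": [0.9, 0.1],
    "car": [0.0, 1.0],
    "bus": [0.1, 0.9],
    "tree": [0.7, 0.7],
}
space_b = {
    "cat": [1.0, 0.0],
    "dog": [0.8, 0.3],
    "car": [0.1, 1.0],
    "bus": [0.9, 0.2],
    "tree": [0.0, 1.0],
}

print(stability("cat", space_a, space_b, k=2))
```

In this toy setup the two nearest neighbors of "cat" are "dog" and "tree" in the first space and "bus" and "dog" in the second, so only one of the two neighbors is shared. A word whose neighborhood is identical in both spaces would score 1.0, and one with no shared neighbors would score 0.0.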