Paper Title
On the Choice of Auxiliary Languages for Improved Sequence Tagging
Authors
Abstract
Recent work showed that embeddings from related languages can improve the performance of sequence tagging, even for monolingual models. In this analysis paper, we investigate whether the best auxiliary language can be predicted based on language distances and show that the most related language is not always the best auxiliary language. Further, we show that attention-based meta-embeddings can effectively combine pre-trained embeddings from different languages for sequence tagging and set new state-of-the-art results for part-of-speech tagging in five languages.