Paper Title
Multilingual and Multimodal Topic Modelling with Pretrained Embeddings
Paper Authors
Paper Abstract
This paper presents M3L-Contrast -- a novel multimodal multilingual (M3L) neural topic model for comparable data that maps texts from multiple languages and images into a shared topic space. Our model is trained jointly on texts and images and takes advantage of pretrained document and image embeddings to abstract away the complexities between different languages and modalities. As a multilingual topic model, it produces aligned language-specific topics; as a multimodal model, it infers textual representations of semantic concepts in images. We demonstrate that our model is competitive with a zero-shot topic model in predicting topic distributions for comparable multilingual data and significantly outperforms a zero-shot model in predicting topic distributions for comparable texts and images. We also show that our model performs almost as well on unaligned embeddings as it does on aligned embeddings.
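The core idea of mapping pretrained embeddings from different modalities into one shared topic space can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual M3L-Contrast architecture: the per-modality projection matrices, the softmax over topics, and all variable names (`proj_text`, `proj_image`, `topic_dist`) are assumptions introduced here for clarity.

```python
import numpy as np

# Hypothetical dimensions: pretrained embeddings (e.g. 512-d) projected
# onto K shared topics. Values are illustrative, not from the paper.
EMB_DIM, N_TOPICS = 512, 20
rng = np.random.default_rng(0)

# One projection head per modality (or language) maps its pretrained
# embeddings into the same K-dimensional topic space; joint training
# would align these heads so comparable inputs get similar topics.
proj_text = rng.normal(scale=0.02, size=(EMB_DIM, N_TOPICS))
proj_image = rng.normal(scale=0.02, size=(EMB_DIM, N_TOPICS))

def topic_dist(embedding, projection):
    """Map a pretrained embedding to a distribution over shared topics."""
    logits = embedding @ projection
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()

# Stand-ins for a pretrained document embedding and image embedding.
doc_emb = rng.normal(size=EMB_DIM)
img_emb = rng.normal(size=EMB_DIM)

theta_doc = topic_dist(doc_emb, proj_text)
theta_img = topic_dist(img_emb, proj_image)
```

Because both `theta_doc` and `theta_img` live on the same K-topic simplex, their topic distributions can be compared directly, which is what lets a single model handle comparable text-text and text-image pairs.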