Paper Title

METEOR: Learning Memory and Time Efficient Representations from Multi-modal Data Streams

Paper Authors

Amila Silva, Shanika Karunasekera, Christopher Leckie, Ling Luo

Paper Abstract

Many learning tasks involve multi-modal data streams, where continuous data from different modes convey a comprehensive description about objects. A major challenge in this context is how to efficiently interpret multi-modal information in complex environments. This has motivated numerous studies on learning unsupervised representations from multi-modal data streams. These studies aim to understand higher-level contextual information (e.g., a Twitter message) by jointly learning embeddings for the lower-level semantic units in different modalities (e.g., text, user, and location of a Twitter message). However, these methods directly associate each low-level semantic unit with a continuous embedding vector, which results in high memory requirements. Hence, deploying and continuously learning such models in low-memory devices (e.g., mobile devices) becomes a problem. To address this problem, we present METEOR, a novel MEmory and Time Efficient Online Representation learning technique, which: (1) learns compact representations for multi-modal data by sharing parameters within semantically meaningful groups and preserves the domain-agnostic semantics; (2) can be accelerated using parallel processes to accommodate different stream rates while capturing the temporal changes of the units; and (3) can be easily extended to capture implicit/explicit external knowledge related to multi-modal data streams. We evaluate METEOR using two types of multi-modal data streams (i.e., social media streams and shopping transaction streams) to demonstrate its ability to adapt to different domains. Our results show that METEOR preserves the quality of the representations while reducing memory usage by around 80% compared to the conventional memory-intensive embeddings.
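The core memory-saving idea in the abstract is that low-level semantic units share embedding parameters within semantically meaningful groups, so the embedding table scales with the number of groups rather than the vocabulary size. The following is a minimal illustrative sketch of that general idea only; the class name, the fixed unit-to-group mapping, and the grouping scheme are assumptions for illustration, not METEOR's actual algorithm (which additionally learns online and captures temporal changes).

```python
import numpy as np

class GroupedEmbedding:
    """Compact embedding table: units in the same semantic group share one
    embedding vector, so memory scales with the number of groups rather
    than the vocabulary size. Hypothetical sketch, not METEOR itself."""

    def __init__(self, unit_to_group, dim, seed=0):
        # unit_to_group maps each low-level unit (word, user, location, ...)
        # to a group id; how groups are formed is out of scope here.
        self.unit_to_group = unit_to_group
        n_groups = len(set(unit_to_group.values()))
        rng = np.random.default_rng(seed)
        # One row per group instead of one row per unit.
        self.table = rng.normal(0.0, 0.1, size=(n_groups, dim))

    def lookup(self, unit):
        return self.table[self.unit_to_group[unit]]

# 10,000 units mapped into 2,000 groups: 80% fewer embedding rows,
# matching the order of memory reduction reported in the abstract.
units = {f"unit{i}": i % 2000 for i in range(10000)}
emb = GroupedEmbedding(units, dim=64)
v1, v2 = emb.lookup("unit0"), emb.lookup("unit2000")
assert np.allclose(v1, v2)  # same group, shared parameters
print(emb.table.shape)      # (2000, 64)
```

In practice the quality of such compact representations hinges on the grouping preserving semantics, which is why the abstract emphasizes sharing within *semantically meaningful* groups rather than arbitrary hashing.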
