Paper Title

Online Deep Learning from Doubly-Streaming Data

Authors

Heng Lian, John Scovil Atwood, Bojian Hou, Jian Wu, Yi He

Abstract

This paper investigates a new online learning problem with doubly-streaming data, where the data streams are described by feature spaces that constantly evolve, with new features emerging and old features fading away. The challenges of this problem are twofold: 1) Data samples that ceaselessly flow in may carry patterns that shift over time, requiring learners to update and adapt on the fly. 2) Newly emerging features are described by very few samples, resulting in weak learners that tend to make erroneous predictions. A plausible idea to overcome these challenges is to establish a relationship between the pre- and post-evolution feature spaces, so that an online learner can leverage the knowledge learned from the old features to improve learning performance on the new features. Unfortunately, this idea does not scale up to high-dimensional media streams with complex feature interplay, which suffer from a tradeoff between onlineness (favoring shallow learners) and expressiveness (requiring deep learners). Motivated by this, we propose a novel OLD^3S paradigm, in which a shared latent subspace is discovered to summarize information from the old and new feature spaces, building an intermediate feature mapping relationship. A key trait of OLD^3S is that it treats model capacity as a learnable semantics, jointly yielding the optimal model depth and parameters in an online fashion, in accordance with the complexity and non-linearity of the input data streams. Both theoretical analyses and empirical studies substantiate the viability and effectiveness of our proposal.
