Paper Title
Deep Neural-Kernel Machines
Paper Authors
Paper Abstract
In this chapter, we review the main literature related to recent advances in deep neural-kernel architectures, an approach that seeks a synergy between two powerful classes of models, namely kernel-based models and artificial neural networks. The introduced deep neural-kernel framework is a hybridization of a neural network architecture and a kernel machine. More precisely, for the kernel counterpart the model is based on Least Squares Support Vector Machines with an explicit feature mapping. Here we discuss the use of one form of explicit feature map obtained by random Fourier features. Thanks to this explicit feature map, on the one hand bridging the two architectures becomes more straightforward, and on the other hand the solution of the associated optimization problem can be found in the primal, making the model scalable to large-scale datasets. We begin by introducing a neural-kernel architecture that serves as the core module for deeper models equipped with different pooling layers. In particular, we review three neural-kernel machines with average, maxout, and convolutional pooling layers. In the average pooling layer, the outputs of the previous representation layers are averaged. The maxout layer triggers competition among different input representations and allows the formation of multiple sub-networks within the same model. The convolutional pooling layer reduces the dimensionality of the multi-scale output representations. Comparisons with the neural-kernel model, kernel-based models, and classical neural network architectures are made, and numerical experiments illustrate the effectiveness of the introduced models on several benchmark datasets.
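As a concrete illustration of the explicit feature map mentioned in the abstract, the sketch below approximates an RBF kernel with random Fourier features and fits a least-squares classifier in the primal. The toy dataset, the hyperparameters (`D`, `gamma`, `lam`), and the omission of the bias term that the full LS-SVM formulation carries are simplifying assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def random_fourier_features(X, W, b):
    # Explicit feature map z(x) = sqrt(2/D) * cos(W x + b); inner products
    # of these features approximate the RBF kernel (Rahimi & Recht, 2007).
    D = W.shape[0]
    return np.sqrt(2.0 / D) * np.cos(X @ W.T + b)

rng = np.random.default_rng(0)
n, d, D = 200, 5, 300                    # samples, input dim, feature count (assumed)
X = rng.normal(size=(n, d))
y = np.sign(X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=n))  # toy +/-1 labels

gamma = 0.5                              # bandwidth of k(x, y) = exp(-gamma ||x - y||^2)
W = rng.normal(scale=np.sqrt(2 * gamma), size=(D, d))  # frequencies ~ N(0, 2*gamma*I)
b = rng.uniform(0.0, 2.0 * np.pi, size=D)              # random phase shifts

Z = random_fourier_features(X, W, b)

# Primal solve: min_w ||Z w - y||^2 + lam ||w||^2 via the normal equations,
# i.e. a ridge-regularized least-squares classifier on the explicit features.
# This is what makes the model scale: the D x D system is independent of n.
lam = 1e-2
w = np.linalg.solve(Z.T @ Z + lam * np.eye(D), Z.T @ y)

print("training accuracy:", np.mean(np.sign(Z @ w) == y))
```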
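The three pooling layers described in the abstract can likewise be sketched in a few lines. The shapes, the strided 1-D convolution, and the function names below are illustrative assumptions; the actual layers in the reviewed models may differ.

```python
import numpy as np

def average_pool(H):
    # Average pooling: element-wise mean of the previous layers' outputs.
    return np.mean(H, axis=0)

def maxout_pool(H):
    # Maxout: the element-wise maximum triggers competition among the input
    # representations, so different sub-networks win different units.
    return np.max(H, axis=0)

def conv_pool(H, kernel):
    # Convolutional pooling: a small filter slid (with stride) over the
    # concatenated multi-scale representation reduces its dimensionality.
    concat = np.concatenate(H, axis=1)                       # (n, k*d)
    stride = len(kernel)
    return np.stack([np.convolve(row, kernel, mode="valid")[::stride]
                     for row in concat])

# Usage: three same-shaped representation outputs of shape (n, d).
H = [np.random.default_rng(i).normal(size=(4, 6)) for i in range(3)]
print(average_pool(H).shape)               # (4, 6)
print(maxout_pool(H).shape)                # (4, 6)
print(conv_pool(H, np.ones(3) / 3).shape)  # (4, 6): 18 -> 16 (valid) -> stride 3
```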