Paper Title

Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks

Paper Authors

Noam Razin, Asaf Maman, Nadav Cohen

Paper Abstract

In the pursuit of explaining implicit regularization in deep learning, prominent focus was given to matrix and tensor factorizations, which correspond to simplified neural networks. It was shown that these models exhibit an implicit tendency towards low matrix and tensor ranks, respectively. Drawing closer to practical deep learning, the current paper theoretically analyzes the implicit regularization in hierarchical tensor factorization, a model equivalent to certain deep convolutional neural networks. Through a dynamical systems lens, we overcome challenges associated with hierarchy, and establish implicit regularization towards low hierarchical tensor rank. This translates to an implicit regularization towards locality for the associated convolutional networks. Inspired by our theory, we design explicit regularization discouraging locality, and demonstrate its ability to improve the performance of modern convolutional networks on non-local tasks, in defiance of conventional wisdom by which architectural changes are needed. Our work highlights the potential of enhancing neural networks via theoretical analysis of their implicit regularization.
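
The paper's hierarchical model is more involved, but the implicit low-rank bias it builds on can be illustrated with a plain (non-hierarchical) matrix factorization trained by gradient descent. Below is a minimal sketch assuming PyTorch; the depth, dimensions, and hyperparameters are arbitrary illustrative choices, not the authors' experimental setup.

```python
# Illustrative sketch only (not code from the paper): a depth-3 matrix
# factorization W = W3 @ W2 @ W1 fitted by gradient descent to a few observed
# entries of a low-rank matrix. With small initialization, the recovered W
# tends to have low effective rank -- the implicit low-rank bias the abstract
# cites for (non-hierarchical) matrix factorization.
import torch

torch.manual_seed(0)
d, r, n_obs = 20, 2, 120                        # matrix size, true rank, observed entries

target = torch.randn(d, r) @ torch.randn(r, d)  # rank-r ground truth
mask = torch.zeros(d * d)
mask[torch.randperm(d * d)[:n_obs]] = 1.0
mask = mask.reshape(d, d)                       # random observation pattern

# Depth-3 factorization with small (near-zero) initialization.
Ws = [torch.nn.Parameter(0.1 * torch.randn(d, d)) for _ in range(3)]
opt = torch.optim.SGD(Ws, lr=0.2)

for _ in range(20000):
    W = Ws[2] @ Ws[1] @ Ws[0]
    loss = ((mask * (W - target)) ** 2).sum() / n_obs  # MSE on observed entries only
    opt.zero_grad()
    loss.backward()
    opt.step()

W = (Ws[2] @ Ws[1] @ Ws[0]).detach()
print("top singular values:", torch.linalg.svdvals(W)[:5].numpy().round(3))
# Expected: roughly r large singular values with the rest near zero,
# even though nothing in the loss explicitly penalizes rank.
```

Nothing in the training objective above constrains rank; the bias toward low rank comes from the parameterization and the gradient-descent dynamics. The paper establishes the analogous phenomenon for hierarchical tensor factorization, where the corresponding notion is low hierarchical tensor rank, translating to locality in the associated convolutional networks.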
