Optimizing Deep Neural Networks through Neuroevolution with Stochastic Gradient Descent

Paper Title

Optimizing Deep Neural Networks through Neuroevolution with Stochastic Gradient Descent

Authors

Zhang, Haichao; Hao, Kuangrong; Gao, Lei; Wei, Bing; Tang, Xuesong

Abstract

Deep neural networks (DNNs) have achieved remarkable success in computer vision; however, training a DNN to satisfactory performance remains challenging and is sensitive to the empirical choice of training optimization algorithm. Stochastic gradient descent (SGD) is the dominant way to train a DNN, adjusting the network weights to minimize the DNN's loss function. As an alternative, neuroevolution is more in line with an evolutionary process and provides some key capabilities that are often unavailable in SGD, such as a heuristic black-box search strategy based on collaboration among individuals. This paper proposes a novel approach that combines the merits of both neuroevolution and SGD, enabling evolutionary search, parallel exploration, and an effective probe for optimal DNNs. A hierarchical cluster-based suppression algorithm is also developed to prevent similar weight updates among individuals and so improve population diversity. We implement the proposed approach in four representative DNNs on four publicly available datasets. Experimental results demonstrate that the four DNNs optimized by the proposed approach all outperform their counterparts optimized by SGD alone on all datasets. The DNNs optimized by the proposed approach also outperform state-of-the-art deep networks. This work also presents a meaningful attempt at pursuing artificial general intelligence.
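
The general idea of combining a population-based search with SGD can be illustrated with a small sketch. It is not the authors' algorithm: the abstract gives no implementation details, so everything here is an assumption made for illustration, including the linear model standing in for a DNN, the function names (`sgd_steps`, `suppress_similar`, `evolve`), all hyperparameters, and the simple distance threshold used in place of the paper's hierarchical cluster-based suppression.

```python
import numpy as np

def loss(w, X, y):
    """Mean squared error of a linear model (a stand-in for a DNN loss)."""
    return float(np.mean((X @ w - y) ** 2))

def sgd_steps(w, X, y, rng, lr=0.05, steps=5, batch=16):
    """Run a few minibatch SGD steps on one individual's weights."""
    w = w.copy()
    for _ in range(steps):
        idx = rng.choice(len(X), size=batch, replace=False)
        grad = 2.0 * X[idx].T @ (X[idx] @ w - y[idx]) / batch
        w -= lr * grad
    return w

def suppress_similar(pop, fitness, min_dist=1e-3):
    """Drop individuals whose weights nearly duplicate a fitter one.

    Assumption: a plain distance threshold, NOT the hierarchical
    cluster-based suppression described in the paper.
    Returns the survivors ordered from lowest to highest loss.
    """
    kept = []
    for i in np.argsort(fitness):  # ascending loss, fittest first
        if all(np.linalg.norm(pop[i] - pop[j]) >= min_dist for j in kept):
            kept.append(i)
    return [pop[i] for i in kept]

def evolve(X, y, pop_size=8, generations=20, sigma=0.1, seed=0):
    """Alternate SGD training of every individual with selection and mutation."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1]
    pop = [rng.normal(size=dim) for _ in range(pop_size)]
    for _ in range(generations):
        # Exploitation: every individual takes a few SGD steps.
        pop = [sgd_steps(w, X, y, rng) for w in pop]
        fitness = np.array([loss(w, X, y) for w in pop])
        # Diversity and selection: remove near-duplicates, keep the fitter half.
        survivors = suppress_similar(pop, fitness)[: max(1, pop_size // 2)]
        # Exploration: refill the population with mutated copies of survivors.
        children = [
            survivors[rng.integers(len(survivors))] + rng.normal(scale=sigma, size=dim)
            for _ in range(pop_size - len(survivors))
        ]
        pop = survivors + children
    return min(pop, key=lambda w: loss(w, X, y))
```

Calling `evolve(X, y)` on a regression dataset returns the weight vector with the lowest loss in the final population. In this sketch the gradient steps do the fine-grained fitting, while selection and mutation let several candidates be explored in parallel; how the paper balances these two roles is not stated in the abstract.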
