Title
Selecting and Composing Learning Rate Policies for Deep Neural Networks
Authors
Abstract
The choice of learning rate (LR) functions and policies has evolved from a simple fixed LR to decaying and cyclic LRs, aiming to improve the accuracy and reduce the training time of Deep Neural Networks (DNNs). This paper presents a systematic approach to selecting and composing an LR policy for effective DNN training that meets a desired target accuracy and reduces training time within a pre-defined number of training iterations. It makes three original contributions. First, we develop an LR tuning mechanism for auto-verification of a given LR policy with respect to the desired accuracy goal under the pre-defined training time constraint. Second, we develop an LR policy recommendation system (LRBench) that selects and composes good LR policies from the same and/or different LR functions through dynamic tuning, and avoids bad choices, for a given learning task, DNN model, and dataset. Third, we extend LRBench to support different DNN optimizers and show the significant mutual impact of different LR policies and different optimizers. Evaluated on popular benchmark datasets and different DNN models (LeNet, CNN3, ResNet), our approach effectively delivers high DNN test accuracy, outperforms the existing recommended default LR policies, and reduces the DNN training time by 1.6$\sim$6.7$\times$ to meet a targeted model accuracy.
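To make the three LR policy families named above concrete, the following is a minimal sketch of fixed, decaying, and cyclic LR functions. The function names and parameter values are illustrative assumptions rather than LRBench's actual API; the cyclic form follows the common triangular schedule.

```python
import math

# Illustrative sketches of the three LR policy families mentioned in
# the abstract. Names and parameters are assumptions for illustration,
# not LRBench's actual API.

def fixed_lr(k0, t):
    """Fixed LR: constant learning rate k0 at every iteration t."""
    return k0

def decaying_lr(k0, gamma, t):
    """Decaying LR (exponential form): k0 shrinks by a factor gamma
    at each iteration t."""
    return k0 * (gamma ** t)

def cyclic_lr(k0, k1, half_cycle, t):
    """Cyclic LR (triangular form): oscillates linearly between the
    lower bound k0 and upper bound k1, with `half_cycle` iterations
    per half-cycle."""
    cycle = math.floor(1 + t / (2 * half_cycle))
    x = abs(t / half_cycle - 2 * cycle + 1)
    return k0 + (k1 - k0) * max(0.0, 1 - x)

# Example: compare the three policies at a few training iterations.
for t in [0, 500, 1000, 1500, 2000]:
    print(t,
          fixed_lr(0.01, t),
          decaying_lr(0.01, 0.999, t),
          cyclic_lr(0.001, 0.01, 1000, t))
```

In the paper's terms, a policy pairs an LR function with concrete parameter values over a range of training iterations, so composing policies amounts to switching among such pairs, drawn from the same or different LR functions, across phases of training.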