Paper Title
Asynchronous Decentralized Bayesian Optimization for Large Scale Hyperparameter Optimization
Paper Authors
Paper Abstract
Bayesian optimization (BO) is a promising approach for hyperparameter optimization of deep neural networks (DNNs), where each model training can take minutes to hours. In BO, a computationally cheap surrogate model is employed to learn the relationship between parameter configurations and their performance (e.g., accuracy). Parallel BO methods often adopt a single-manager/multiple-workers strategy to evaluate multiple hyperparameter configurations simultaneously. Despite the long hyperparameter evaluation times, the overhead of such centralized schemes prevents these methods from scaling to a large number of workers. We present an asynchronous decentralized BO, wherein each worker runs a sequential BO and asynchronously communicates its results through shared storage. We scale our method without loss of computational efficiency, with over 95% worker utilization, to 1,920 parallel workers (the full production queue of the Polaris supercomputer), and demonstrate improved model accuracy as well as faster convergence on the CANDLE benchmark from the Exascale Computing Project.
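
To make the scheme described in the abstract concrete, below is a minimal sketch of one worker's loop. It is not the paper's implementation: the shared-storage path, the `train_dnn` objective, the search bounds, the one-file-per-worker layout, and the use of a scikit-learn Gaussian process surrogate with an upper-confidence-bound acquisition are all illustrative assumptions. Every worker runs this same loop independently; there is no central manager, so no worker ever waits on another.

```python
# Sketch of one worker in an asynchronous decentralized BO loop.
# Assumptions (not from the paper): a shared filesystem directory stands in
# for the shared storage; the surrogate is a scikit-learn Gaussian process;
# the acquisition is upper confidence bound (UCB); train_dnn and the search
# bounds are hypothetical placeholders.
import json
import os
import uuid
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

SHARED_DIR = "/lus/shared/bo_results"        # hypothetical shared-storage path
WORKER_ID = uuid.uuid4().hex
BOUNDS = np.array([[1e-5, 1e-1], [8, 512]])  # e.g., learning rate, batch size


def train_dnn(config):
    """Hypothetical expensive objective: train a DNN, return its accuracy."""
    raise NotImplementedError


def read_all_results():
    """Pull every worker's published results from shared storage."""
    X, y = [], []
    for fname in os.listdir(SHARED_DIR):
        with open(os.path.join(SHARED_DIR, fname)) as f:
            for line in f:
                rec = json.loads(line)
                X.append(rec["x"])
                y.append(rec["y"])
    return np.array(X), np.array(y)


def append_result(x, y):
    """Publish this worker's result; one file per worker avoids write contention."""
    path = os.path.join(SHARED_DIR, f"{WORKER_ID}.jsonl")
    with open(path, "a") as f:
        f.write(json.dumps({"x": list(x), "y": y}) + "\n")


def suggest(X, y, n_candidates=1024, kappa=1.96):
    """One sequential-BO step: fit the surrogate, maximize UCB over random candidates."""
    if len(y) < 5:  # cold start: random sampling until the surrogate has data
        return np.random.uniform(BOUNDS[:, 0], BOUNDS[:, 1])
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
    cands = np.random.uniform(BOUNDS[:, 0], BOUNDS[:, 1],
                              size=(n_candidates, len(BOUNDS)))
    mu, sigma = gp.predict(cands, return_std=True)
    return cands[np.argmax(mu + kappa * sigma)]


# The worker loop: read everyone's results, suggest, evaluate, publish.
while True:
    X, y = read_all_results()
    x_next = suggest(X, y)
    y_next = train_dnn(x_next)
    append_result(x_next, y_next)
```

Because each worker only appends to its own file and reads the others' at the start of each iteration, communication is fully asynchronous: stale reads cost at most one slightly less informed suggestion, never a blocked worker, which is consistent with the high worker utilization the abstract reports.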