Paper Title


CLOSE: Curriculum Learning On the Sharing Extent Towards Better One-shot NAS

Paper Authors

Zixuan Zhou, Xuefei Ning, Yi Cai, Jiashu Han, Yiping Deng, Yuhan Dong, Huazhong Yang, Yu Wang

Paper Abstract

One-shot Neural Architecture Search (NAS) has been widely used to discover architectures due to its efficiency. However, previous studies reveal that one-shot performance estimations of architectures might not be well correlated with their performances in stand-alone training because of the excessive sharing of operation parameters (i.e., large sharing extent) between architectures. Thus, recent methods construct even more over-parameterized supernets to reduce the sharing extent. But these improved methods introduce a large number of extra parameters and thus cause an undesirable trade-off between the training costs and the ranking quality. To alleviate the above issues, we propose to apply Curriculum Learning On Sharing Extent (CLOSE) to train the supernet both efficiently and effectively. Specifically, we train the supernet with a large sharing extent (an easier curriculum) at the beginning and gradually decrease the sharing extent of the supernet (a harder curriculum). To support this training strategy, we design a novel supernet (CLOSENet) that decouples the parameters from operations to realize a flexible sharing scheme and adjustable sharing extent. Extensive experiments demonstrate that CLOSE can obtain a better ranking quality across different computational budget constraints than other one-shot supernets, and is able to discover superior architectures when combined with various search strategies. Code is available at https://github.com/walkerning/aw_nas.
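
The abstract only sketches the training strategy, so a minimal toy illustration of "curriculum learning on the sharing extent" may help: a pool of parameter blocks is decoupled from the operation slots of a supernet, starts with a single block shared by every slot (the largest sharing extent, i.e., the easiest curriculum), and is gradually split into more blocks as training proceeds (a harder curriculum). The names (`SharedParamPool`, `decrease_sharing`, `sharing_curriculum`) and the round-robin re-assignment below are hypothetical simplifications for illustration only, not the actual CLOSENet modules from the paper or the aw_nas codebase.

```python
import torch.nn as nn


class SharedParamPool(nn.Module):
    """Toy pool of parameter blocks decoupled from operation slots.

    Starts with one block shared by all slots (largest sharing extent) and
    grows as the curriculum becomes harder.
    """

    def __init__(self, channels, num_op_slots):
        super().__init__()
        self.channels = channels
        self.num_op_slots = num_op_slots
        # Easiest curriculum: a single block, every op slot maps to block 0.
        self.blocks = nn.ModuleList([self._new_block()])
        self.assignment = [0] * num_op_slots  # op slot -> block index

    def _new_block(self):
        return nn.Sequential(
            nn.Conv2d(self.channels, self.channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(self.channels),
            nn.ReLU(inplace=True),
        )

    def decrease_sharing(self):
        """Harder curriculum step: add a block so fewer slots share each one."""
        self.blocks.append(self._new_block())
        n_blocks = len(self.blocks)
        # Fixed round-robin re-assignment; a learned assignment is also possible.
        self.assignment = [i % n_blocks for i in range(self.num_op_slots)]

    def forward(self, x, op_slot):
        # An operation slot computes with whichever block it is assigned to.
        return self.blocks[self.assignment[op_slot]](x)


def sharing_curriculum(epoch, milestones=(100, 200, 300)):
    """Step schedule: how many times the sharing extent has been decreased."""
    return sum(epoch >= m for m in milestones)


# Sketch of the training loop driving the curriculum.
pool = SharedParamPool(channels=16, num_op_slots=8)
for epoch in range(400):
    while len(pool.blocks) - 1 < sharing_curriculum(epoch):
        pool.decrease_sharing()
    # ... sample architectures and train the supernet for one epoch,
    #     drawing each operation's parameters from `pool` ...
```

In a real implementation, blocks created mid-training would also need to be registered with the optimizer and could inherit weights from the block they split off from, so that the harder curriculum starts from the parameters learned under the easier one.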
