Paper title
Arch-Graph: Acyclic Architecture Relation Predictor for Task-Transferable Neural Architecture Search
Paper authors
Paper abstract
Neural Architecture Search (NAS) aims to find efficient models for multiple tasks. Beyond seeking solutions for a single task, there is surging interest in transferring network design knowledge across multiple tasks. In this line of research, effectively modeling task correlations is vital yet largely neglected. Therefore, we propose Arch-Graph, a transferable NAS method that predicts task-specific optimal architectures with respect to given task embeddings. It leverages correlations across multiple tasks by using their embeddings as part of the predictor's input for fast adaptation. We also formulate NAS as an architecture relation graph prediction problem, with the relational graph constructed by treating candidate architectures as nodes and their pairwise relations as edges. To enforce basic properties such as acyclicity in the relational graph, we add constraints to the optimization process, converting NAS into the problem of finding a Maximal Weighted Acyclic Subgraph (MWAS). Our algorithm then strives to eliminate cycles and establishes an edge in the graph only if the rank results can be trusted. Through MWAS, Arch-Graph can effectively rank candidate models for each task with only a small budget to finetune the predictor. With extensive experiments on TransNAS-Bench-101, we show Arch-Graph's transferability and high sample efficiency across numerous tasks, beating many NAS methods designed for both single-task and multi-task search. It is able to find top 0.16% and 0.29% architectures on average on two search spaces under the budget of only 50 models.
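The idea of ranking architectures through an acyclic relation graph can be illustrated with a small Python sketch. It is not the paper's MWAS algorithm, whose details the abstract does not give. It is a simple greedy heuristic under stated assumptions: a predictor has already produced pairwise relations of the form (better, worse, confidence), edges below a hypothetical `min_confidence` threshold are treated as untrusted, and edges are added from most to least confident unless they would close a cycle. The function name, parameters, and sample scores are all invented for illustration, and the greedy choice does not guarantee a maximum-weight subgraph.

```python
from collections import defaultdict


def greedy_acyclic_ranking(num_archs, scored_pairs, min_confidence=0.5):
    """Build an acyclic relation graph greedily and rank architectures from it.

    scored_pairs: iterable of (better, worse, confidence) tuples, where the
    indices are architecture ids in range(num_archs). All names here are
    hypothetical; this is an illustrative heuristic, not the paper's method.
    """
    successors = defaultdict(set)  # better -> set of worse architectures

    def reaches(src, dst):
        # Iterative depth-first search: is dst reachable from src?
        stack, seen = [src], {src}
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            for nxt in successors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    kept = []
    # Visit the most confident relations first.
    for better, worse, conf in sorted(scored_pairs, key=lambda e: -e[2]):
        if conf < min_confidence:
            break  # remaining relations are not trusted
        if better == worse or reaches(worse, better):
            continue  # adding this edge would create a cycle
        successors[better].add(worse)
        kept.append((better, worse, conf))

    # Kahn's algorithm: a topological order of the acyclic graph is a ranking.
    indegree = [0] * num_archs
    for _, worse, _ in kept:
        indegree[worse] += 1
    ready = sorted(i for i in range(num_archs) if indegree[i] == 0)
    ranking = []
    while ready:
        node = ready.pop(0)
        ranking.append(node)
        for nxt in sorted(successors[node]):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    return ranking, kept


# Made-up predictions for four candidate architectures.
pairs = [(0, 1, 0.9), (1, 2, 0.8), (2, 0, 0.6), (0, 3, 0.3)]
ranking, kept = greedy_acyclic_ranking(4, pairs)
print(ranking, kept)
```

In this toy input, the relation (2, 0) contradicts the two more confident relations and is dropped to keep the graph acyclic, while (0, 3) is discarded as untrusted, so architecture 3 is left unconstrained in the ranking.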