Transferability-Guided Cross-Domain Cross-Task Transfer Learning

Paper Title

Transferability-Guided Cross-Domain Cross-Task Transfer Learning

Authors

Tan, Yang; Zhang, Enming; Li, Yang; Huang, Shao-Lun; Zhang, Xiao-Ping

Abstract

We propose two novel transferability metrics, F-OTCE (Fast Optimal Transport based Conditional Entropy) and JC-OTCE (Joint Correspondence OTCE), to evaluate how much the source model (task) can benefit the learning of the target task and to learn more transferable representations for cross-domain cross-task transfer learning. Unlike the existing metric, which requires evaluating the empirical transferability on auxiliary tasks, our metrics are auxiliary-free and can therefore be computed much more efficiently. Specifically, F-OTCE estimates transferability by first solving an Optimal Transport (OT) problem between the source and target distributions, and then using the optimal coupling to compute the Negative Conditional Entropy between source and target labels. It can also serve as a loss function to maximize the transferability of the source model before fine-tuning on the target task. Meanwhile, JC-OTCE improves the transferability robustness of F-OTCE by including label distances in the OT problem, though it may incur additional computation cost. Extensive experiments demonstrate that F-OTCE and JC-OTCE outperform state-of-the-art auxiliary-free metrics by 18.85% and 28.88%, respectively, in correlation coefficient with the ground-truth transfer accuracy. By eliminating the training cost of auxiliary tasks, the two metrics reduce the total computation time of the previous method from 43 minutes to 9.32s and 10.78s, respectively, for a pair of tasks. When used as a loss function, F-OTCE shows consistent improvements in the transfer accuracy of the source model in few-shot classification experiments, with up to 4.41% accuracy gain.
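The two-step F-OTCE computation described in the abstract (solve an OT problem between source and target samples, then compute the negative conditional entropy of target labels given source labels under the resulting coupling) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the squared Euclidean cost, the uniform sample weights, the entropic (Sinkhorn) solver, the regularization value, and the names `sinkhorn_coupling` and `f_otce_score` are all assumptions made for the example.

```python
import numpy as np

def sinkhorn_coupling(cost, reg=0.1, n_iters=200):
    """Approximate OT coupling with uniform marginals (assumed solver: Sinkhorn)."""
    m, n = cost.shape
    a = np.full(m, 1.0 / m)              # uniform weight on each source sample
    b = np.full(n, 1.0 / n)              # uniform weight on each target sample
    scale = cost.max() if cost.max() > 0 else 1.0
    K = np.exp(-cost / (reg * scale))    # Gibbs kernel on the normalized cost
    u = np.ones(m)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

def f_otce_score(src_feats, src_labels, tgt_feats, tgt_labels, reg=0.1):
    """Negative conditional entropy -H(Y_t | Y_s) under the OT coupling."""
    # Step 1: OT between source and target features (squared Euclidean cost assumed).
    diff = src_feats[:, None, :] - tgt_feats[None, :, :]
    coupling = sinkhorn_coupling((diff ** 2).sum(axis=-1), reg=reg)

    # Step 2: aggregate the coupling mass into a joint distribution over label pairs.
    src_classes = np.unique(src_labels)
    tgt_classes = np.unique(tgt_labels)
    joint = np.zeros((len(src_classes), len(tgt_classes)))
    for i, ys in enumerate(src_classes):
        rows = coupling[src_labels == ys]
        for j, yt in enumerate(tgt_classes):
            joint[i, j] = rows[:, tgt_labels == yt].sum()
    joint /= joint.sum()

    # -H(Y_t | Y_s) = sum_{ys, yt} P(ys, yt) * log P(yt | ys); always <= 0.
    cond = joint / joint.sum(axis=1, keepdims=True)
    mask = joint > 0
    return float((joint[mask] * np.log(cond[mask])).sum())

# Toy usage: two well-separated clusters, with labels that follow the clusters.
rng = np.random.default_rng(0)
src = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(10, 1, (20, 2))])
tgt = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(10, 1, (20, 2))])
labels = np.repeat([0, 1], 20)
print("F-OTCE-style score:", f_otce_score(src, labels, tgt, labels))
```

A score closer to zero means the source label is more predictive of the target label under the coupling, which is the sense in which the metric indicates higher transferability. In this sketch, JC-OTCE would differ only in the cost matrix, which would additionally include a label-distance term; the abstract does not specify that term, so it is omitted here.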
