Paper Title

Rethinking Importance Weighting for Deep Learning under Distribution Shift

Paper Authors

Tongtong Fang, Nan Lu, Gang Niu, Masashi Sugiyama

Paper Abstract

Under distribution shift (DS) where the training data distribution differs from the test one, a powerful technique is importance weighting (IW) which handles DS in two separate steps: weight estimation (WE) estimates the test-over-training density ratio and weighted classification (WC) trains the classifier from weighted training data. However, IW cannot work well on complex data, since WE is incompatible with deep learning. In this paper, we rethink IW and theoretically show it suffers from a circular dependency: we need not only WE for WC, but also WC for WE where a trained deep classifier is used as the feature extractor (FE). To cut off the dependency, we try to pretrain FE from unweighted training data, which leads to biased FE. To overcome the bias, we propose an end-to-end solution dynamic IW that iterates between WE and WC and combines them in a seamless manner, and hence our WE can also enjoy deep networks and stochastic optimizers indirectly. Experiments with two representative types of DS on three popular datasets show that our dynamic IW compares favorably with state-of-the-art methods.
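
The core idea in the abstract is an alternating loop: weighted classification (WC) updates a deep classifier whose hidden representation serves as the feature extractor (FE), and weight estimation (WE) then re-estimates the test-over-training density ratio on those features, so neither step has to wait for the other to finish. The sketch below is only an illustration of that iteration, not the authors' Dynamic IW algorithm: the toy MLP, the hyperparameters, and the weight estimator (a logistic-regression domain discriminator whose output is converted into a density-ratio estimate) are placeholder assumptions.

```python
# Minimal sketch of an iterative WE <-> WC loop under covariate shift.
# Assumes PyTorch; the weight estimator and model are illustrative stand-ins.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    """Deep classifier; its penultimate layer doubles as the feature extractor (FE)."""
    def __init__(self, in_dim, feat_dim=128, n_classes=10):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                  nn.Linear(256, feat_dim), nn.ReLU())
        self.head = nn.Linear(feat_dim, n_classes)

    def features(self, x):
        return self.body(x)

    def forward(self, x):
        return self.head(self.body(x))

def estimate_weights(feat_tr, feat_te):
    """WE step (stand-in): fit a linear domain discriminator on the current
    features and convert its output into a test-over-training density ratio,
    w(x) ~ (n_tr / n_te) * d(x) / (1 - d(x))."""
    X = torch.cat([feat_tr, feat_te]).detach()
    y = torch.cat([torch.zeros(len(feat_tr)), torch.ones(len(feat_te))])
    disc = nn.Linear(X.shape[1], 1)
    opt = torch.optim.Adam(disc.parameters(), lr=1e-2)
    for _ in range(200):
        opt.zero_grad()
        loss = F.binary_cross_entropy_with_logits(disc(X).squeeze(1), y)
        loss.backward()
        opt.step()
    with torch.no_grad():
        d = torch.sigmoid(disc(feat_tr.detach())).squeeze(1).clamp(1e-3, 1 - 1e-3)
        w = (len(feat_tr) / len(feat_te)) * d / (1 - d)
    return w / w.mean()  # normalize so the weights average to 1

def dynamic_iw_sketch(x_tr, y_tr, x_te, n_rounds=10):
    """Alternate WC and WE so each step uses the other's latest output."""
    net = Net(in_dim=x_tr.shape[1], n_classes=int(y_tr.max()) + 1)
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    w = torch.ones(len(x_tr))  # start from uniform (unweighted) training
    for _ in range(n_rounds):
        # WC step: one pass of importance-weighted classification.
        opt.zero_grad()
        loss = (w * F.cross_entropy(net(x_tr), y_tr, reduction="none")).mean()
        loss.backward()
        opt.step()
        # WE step: re-estimate the weights with the (now updated) FE.
        w = estimate_weights(net.features(x_tr), net.features(x_te))
    return net
```

Normalizing the weights to mean one keeps the effective learning rate of the WC step comparable to unweighted training, and clamping the discriminator output avoids unbounded ratios; both are common stabilizers added here for the sketch, not details taken from the abstract.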
