Paper Title
Beyond Invariance: Test-Time Label-Shift Adaptation for Distributions with "Spurious" Correlations
Paper Authors
Abstract
Changes in the data distribution at test time can have deleterious effects on the performance of predictive models $p(y|x)$. We consider situations where there are additional meta-data labels (such as group labels), denoted by $z$, that can account for such changes in the distribution. In particular, we assume that the prior distribution $p(y, z)$, which models the dependence between the class label $y$ and the "nuisance" factors $z$, may change across domains, either due to a change in the correlation between these terms, or a change in one of their marginals. However, we assume that the generative model for features $p(x|y,z)$ is invariant across domains. We note that this corresponds to an expanded version of the widely used "label shift" assumption, where the labels now also include the nuisance factors $z$. Based on this observation, we propose a test-time label shift correction that adapts to changes in the joint distribution $p(y, z)$ using EM applied to unlabeled samples from the target domain distribution, $p_t(x)$. Importantly, we are able to avoid fitting a generative model $p(x|y, z)$, and merely need to reweight the outputs of a discriminative model $p_s(y, z|x)$ trained on the source distribution. We evaluate our method, which we call "Test-Time Label-Shift Adaptation" (TTLSA), on several standard image and text datasets, as well as the CheXpert chest X-ray dataset, and show that it improves performance over methods that target invariance to changes in the distribution, as well as baseline empirical risk minimization methods. Code for reproducing experiments is available at https://github.com/nalzok/test-time-label-shift .
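To make the reweighting idea concrete, the following is a minimal sketch of the standard EM update for estimating a shifted label prior from unlabeled target samples, applied to joint $(y, z)$ classes. It is not the authors' implementation (see the repository above for that). The function names `em_label_shift` and `marginalize_over_z`, the flattening of $(y, z)$ into a single class index, and the stopping rule are assumptions made for illustration; the sketch also assumes the source prior $p_s(y, z)$ is strictly positive and that the classifier outputs are well calibrated.

```python
import numpy as np


def em_label_shift(source_probs, source_prior, n_iter=100, tol=1e-8):
    """Estimate the target prior p_t(y, z) with EM (illustrative sketch).

    source_probs: (N, K) array; row n holds p_s(m | x_n) over the K joint
                  (y, z) classes, as output by the source-trained classifier.
    source_prior: (K,) array; p_s(m), assumed strictly positive.
    Returns the estimated target prior (K,) and adapted posteriors (N, K).
    """
    target_prior = source_prior.copy()
    for _ in range(n_iter):
        # E-step: reweight classifier outputs by the prior ratio, renormalize.
        weighted = source_probs * (target_prior / source_prior)
        posteriors = weighted / weighted.sum(axis=1, keepdims=True)
        # M-step: the new prior is the average adapted posterior.
        new_prior = posteriors.mean(axis=0)
        converged = np.abs(new_prior - target_prior).max() < tol
        target_prior = new_prior
        if converged:
            break
    # Recompute posteriors under the final prior estimate.
    weighted = source_probs * (target_prior / source_prior)
    posteriors = weighted / weighted.sum(axis=1, keepdims=True)
    return target_prior, posteriors


def marginalize_over_z(joint_posteriors, n_y, n_z):
    """Sum out z to obtain p_t(y | x); assumes class index m = y * n_z + z."""
    return joint_posteriors.reshape(-1, n_y, n_z).sum(axis=2)
```

In this sketch no generative model $p(x|y, z)$ is ever fitted: only the discriminative outputs $p_s(y, z|x)$ and the source prior are needed, which matches the reweighting described in the abstract. The final class prediction would be read off the marginal returned by `marginalize_over_z`.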