Paper Title

ISTRBoost: Importance Sampling Transfer Regression using Boosting

Paper Authors

Shrey Gupta, Jianzhao Bi, Yang Liu, Avani Wildani

Paper Abstract

Current Instance Transfer Learning (ITL) methodologies use domain adaptation and subspace transformation to achieve successful transfer learning. However, in the process, these methodologies sometimes overfit the target dataset or suffer from negative transfer if the test dataset has high variance. Boosting methodologies have been shown to reduce the risk of overfitting by iteratively re-weighting instances with high residuals. However, this balance is usually achieved through parameter optimization, together with reducing the skewness in weights that arises from the size of the source dataset. While the former is attainable, the latter is more challenging and can lead to negative transfer. We introduce a simpler and more robust fix to this problem by building upon the popular boosting ITL regression methodology, two-stage TrAdaBoost.R2. Our methodology, ISTRBoost, is a boosting- and random-forest-based ensemble methodology that utilizes importance sampling to reduce the skewness due to the source dataset. We show that ISTRBoost performs better than competitive transfer learning methodologies 63% of the time. It also displays consistent performance over diverse datasets with varying complexities, as opposed to the sporadic results observed for other transfer learning methodologies.
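For readers unfamiliar with the mechanism the abstract describes, the sketch below illustrates, under loose assumptions, how importance sampling of the source set can be folded into a boosting-style loop of random-forest learners. The function names, the sampling-and-reweighting heuristic, and the scikit-learn estimators are our own illustrative choices, not the authors' implementation of ISTRBoost.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def importance_sample_source(X_src, y_src, w_src, n_keep, rng):
    """Subsample source instances in proportion to their boosting weights.

    Sampling with probability proportional to weight keeps influential
    source instances while capping how much weight mass the (much larger)
    source dataset can contribute per round -- a hypothetical stand-in
    for the paper's importance-sampling step.
    """
    probs = w_src / w_src.sum()
    idx = rng.choice(len(X_src), size=n_keep, replace=True, p=probs)
    return X_src[idx], y_src[idx]


def boosted_transfer_fit(X_src, y_src, X_tgt, y_tgt,
                         n_rounds=10, n_keep=200, seed=0):
    """Fit a small ensemble of random forests on importance-sampled
    source data plus the full target data (illustrative only)."""
    rng = np.random.default_rng(seed)
    w_src = np.full(len(X_src), 1.0 / len(X_src))
    ensemble = []
    for _ in range(n_rounds):
        # Importance-sample the source set so it cannot swamp the target.
        Xs, ys = importance_sample_source(X_src, y_src, w_src, n_keep, rng)
        X = np.vstack([Xs, X_tgt])
        y = np.concatenate([ys, y_tgt])
        model = RandomForestRegressor(n_estimators=50, random_state=seed)
        model.fit(X, y)
        ensemble.append(model)
        # TrAdaBoost-style update: down-weight source instances the current
        # learner fits poorly, treating them as less relevant to the target.
        resid = np.abs(model.predict(X_src) - y_src)
        resid /= resid.max() + 1e-12
        w_src *= np.exp(-resid)
        w_src /= w_src.sum()
    return ensemble


def ensemble_predict(ensemble, X):
    """Average predictions across rounds (uniform weights for simplicity)."""
    return np.mean([m.predict(X) for m in ensemble], axis=0)
```

In this toy version, the per-round subsample size n_keep bounds the source dataset's influence regardless of its raw size, which is exactly the weight-skewness problem the abstract targets; the actual two-stage TrAdaBoost.R2 weight updates in the paper are more involved.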
