Minimax Active Learning

Paper Title

Minimax Active Learning

Authors

Sayna Ebrahimi, William Gan, Dian Chen, Giscard Biamby, Kamyar Salahi, Michael Laielli, Shizhan Zhu, Trevor Darrell

Abstract

Active learning aims to develop label-efficient algorithms by querying the most representative samples to be labeled by a human annotator. Current active learning techniques either rely on model uncertainty to select the most uncertain samples or use clustering or reconstruction to choose the most diverse set of unlabeled examples. While uncertainty-based strategies are susceptible to outliers, solely relying on sample diversity does not capture the information available on the main task. In this work, we develop a semi-supervised minimax entropy-based active learning algorithm that leverages both uncertainty and diversity in an adversarial manner. Our model consists of an entropy minimizing feature encoding network followed by an entropy maximizing classification layer. This minimax formulation reduces the distribution gap between the labeled/unlabeled data, while a discriminator is simultaneously trained to distinguish the labeled/unlabeled data. The highest entropy samples from the classifier that the discriminator predicts as unlabeled are selected for labeling. We evaluate our method on various image classification and semantic segmentation benchmark datasets and show superior performance over the state-of-the-art methods.
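The selection rule in the abstract's second-to-last sentence can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the classifier yields a probability vector per unlabeled sample and the discriminator yields a probability that the sample belongs to the labeled set; the function names, the 0.5 threshold, and the toy numbers are all hypothetical.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of one class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_labeling(class_probs, disc_labeled_prob, budget, threshold=0.5):
    """Return indices of up to `budget` samples to send to the annotator.

    Assumption: disc_labeled_prob[i] is the discriminator's probability that
    sample i is labeled, so values below `threshold` mean "predicted unlabeled".
    """
    # Keep only samples the discriminator predicts as unlabeled.
    candidates = [i for i, d in enumerate(disc_labeled_prob) if d < threshold]
    # Rank the remaining samples by classifier entropy, highest first.
    candidates.sort(key=lambda i: predictive_entropy(class_probs[i]), reverse=True)
    return candidates[:budget]


# Toy pool of four unlabeled samples with two classes (made-up values).
class_probs = [[0.5, 0.5], [0.9, 0.1], [0.6, 0.4], [0.5, 0.5]]
disc_labeled_prob = [0.2, 0.1, 0.3, 0.9]

picked = select_for_labeling(class_probs, disc_labeled_prob, budget=2)
print(picked)  # [0, 2]
```

Sample 3 has maximal entropy but is skipped because the discriminator scores it as looking labeled, which is how the rule combines uncertainty (entropy) with a diversity signal (the discriminator).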
