Paper Title

Few-Shot Unlearning by Model Inversion

Paper Authors

Youngsik Yoon, Jinhwan Nam, Hyojeong Yun, Jaeho Lee, Dongwoo Kim, Jungseul Ok

Paper Abstract

We consider a practical scenario of machine unlearning to erase a target dataset that causes unexpected behavior in the trained model. In the standard unlearning scenario, the target dataset is often assumed to be fully identifiable. Such flawless identification, however, is almost impossible if the training dataset is inaccessible at the time of unlearning. Unlike previous approaches that require a complete set of targets, we consider a few-shot unlearning scenario in which only a few samples of the target data are available. To this end, we formulate the few-shot unlearning problem so that it specifies the intention behind the unlearning request (e.g., pure unlearning, mislabel correction, privacy protection), and we devise a straightforward framework that (i) retrieves a proxy of the training data via model inversion, fully exploiting the information available in the context of unlearning; (ii) adjusts the proxy according to the unlearning intention; and (iii) updates the model with the adjusted proxy. We demonstrate that our method, using only a subset of the target data, can outperform state-of-the-art unlearning methods even when they are given a complete indication of the target data.
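To make the three-step framework concrete, below is a minimal PyTorch sketch under strong simplifying assumptions: a 32x32 RGB classifier, a naive gradient-based inversion that synthesizes confidently classified inputs, and a hypothetical nearest-neighbour filter with threshold `tau` for deciding which proxy points resemble the few given target samples. None of these specifics come from the paper; they only illustrate the shape of the (i) retrieve, (ii) adjust, (iii) update pipeline.

```python
import torch
import torch.nn.functional as F


def invert_model(model, num_classes, n_per_class=16, steps=200, lr=0.1):
    """Step (i): retrieve a proxy of the inaccessible training data by
    optimizing random inputs until the trained model labels them confidently.
    The 3x32x32 input shape is an illustrative assumption."""
    model.eval()
    labels = torch.arange(num_classes).repeat_interleave(n_per_class)
    x = torch.randn(len(labels), 3, 32, 32, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(model(x), labels).backward()
        opt.step()
    return x.detach(), labels


def adjust_proxy(x, y, target_x, intent="erase", corrected_y=None, tau=10.0):
    """Step (ii): adjust the proxy according to the unlearning intention.
    The nearest-neighbour distance test with threshold tau is a hypothetical
    heuristic, not the paper's actual matching procedure."""
    d = torch.cdist(x.flatten(1), target_x.flatten(1)).min(dim=1).values
    is_target = d < tau  # proxy points resembling the few target samples
    if intent == "erase":  # pure unlearning: drop target-like proxies
        keep = ~is_target
        return x[keep], y[keep]
    if intent == "correct":  # mislabel correction: relabel instead of dropping
        y = y.clone()
        y[is_target] = corrected_y
        return x, y
    raise ValueError(f"unknown intent: {intent}")


def update_model(model, x, y, epochs=5, lr=1e-3):
    """Step (iii): fine-tune the model on the adjusted proxy."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt.step()
    return model
```

Under the "erase" intent, this amounts to retraining on proxy data with the target-like portion removed, approximating retraining from scratch without access to the original dataset; the "correct" intent instead keeps those points but reassigns their labels.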
