Paper Title


Extracted BERT Model Leaks More Information than You Think!

Paper Authors

Xuanli He, Chen Chen, Lingjuan Lyu, Qiongkai Xu

Paper Abstract


The collection and availability of big data, combined with advances in pre-trained models (e.g., BERT), have revolutionized the predictive performance of natural language processing tasks. This allows corporations to provide machine learning as a service (MLaaS) by encapsulating fine-tuned BERT-based models as APIs. Due to significant commercial interest, there has been a surge of attempts to steal remote services via model extraction. Although previous works have made progress in defending against model extraction attacks, there has been little discussion of their performance in preventing privacy leakage. This work bridges the gap by launching an attribute inference attack against the extracted BERT model. Our extensive experiments reveal that model extraction can cause severe privacy leakage even when victim models are fortified with advanced defensive strategies.
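To make the two-step threat model in the abstract concrete (extract a local copy of the victim by labelling queries through its API, then probe the copy's representations for a sensitive attribute), here is a minimal, self-contained PyTorch sketch. It is purely illustrative and not the authors' implementation: toy random tensors stand in for BERT encodings, the victim is a small local network playing the role of the remote API, and the synthetic sensitive attribute is a hypothetical placeholder.

```python
# Toy sketch: model extraction followed by attribute inference on the copy.
import torch
from torch import nn

torch.manual_seed(0)
DIM, N_QUERIES = 32, 512

class Classifier(nn.Module):
    """Stand-in for a fine-tuned BERT classifier (encoder + task head)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(DIM, 64), nn.Tanh())
        self.head = nn.Linear(64, 2)
    def forward(self, x):
        h = self.encoder(x)        # plays the role of a [CLS] representation
        return self.head(h), h

victim = Classifier()              # black box: the attacker only sees labels

# --- Step 1: model extraction ---------------------------------------------
queries = torch.randn(N_QUERIES, DIM)                # attacker's query inputs
with torch.no_grad():
    victim_labels = victim(queries)[0].argmax(dim=1) # simulated API answers

extracted = Classifier()
opt = torch.optim.Adam(extracted.parameters(), lr=1e-2)
for _ in range(200):               # fit the local copy to the API's labels
    logits, _ = extracted(queries)
    loss = nn.functional.cross_entropy(logits, victim_labels)
    opt.zero_grad(); loss.backward(); opt.step()

# --- Step 2: attribute inference against the extracted model --------------
# Train a linear probe to recover a (synthetic) sensitive attribute from the
# extracted model's hidden representations.
sensitive_attr = (queries[:, 0] > 0).long()          # toy sensitive attribute
with torch.no_grad():
    reps = extracted(queries)[1]

probe = nn.Linear(64, 2)
popt = torch.optim.Adam(probe.parameters(), lr=1e-2)
for _ in range(200):
    loss = nn.functional.cross_entropy(probe(reps), sensitive_attr)
    popt.zero_grad(); loss.backward(); popt.step()

with torch.no_grad():
    acc = (probe(reps).argmax(1) == sensitive_attr).float().mean()
print(f"attribute-inference accuracy on extracted model: {acc:.2f}")
```

A probe accuracy well above chance on the extracted copy is the kind of leakage the paper measures; the point of the attack is that the adversary never needs direct access to the victim model's internals.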
