Paper Title
Federated Learning with Noisy User Feedback
Paper Authors
Paper Abstract
Machine Learning (ML) systems are becoming increasingly popular and drive more and more applications and services in our daily lives. This has led to growing concerns over user privacy, since human interaction data typically needs to be transmitted to the cloud in order to train and improve such systems. Federated learning (FL) has recently emerged as a method for training ML models on edge devices using sensitive user data and is seen as a way to mitigate concerns over data privacy. However, since ML models are most commonly trained with label supervision, we need a way to extract labels on the edge to make FL viable. In this work, we propose a strategy for training FL models using positive and negative user feedback. We also design a novel framework to study different noise patterns in user feedback, and explore how well standard noise-robust objectives can help mitigate this noise when training models in a federated setting. We evaluate our proposed training setup through detailed experiments on two text classification datasets and analyze the effects of varying levels of user reliability and feedback noise on model performance. We show that our method improves substantially over a self-training baseline, achieving performance closer to models trained with full supervision.
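The abstract describes on-device training from positive/negative user feedback with a noise-robust objective, but gives no implementation details. The snippet below is a minimal, purely illustrative sketch (not the paper's method): it assumes a hypothetical feedback-to-label scheme, uses symmetric cross-entropy (a standard noise-robust loss) in place of whichever objective the authors used, and wraps local updates in a simple FedAvg-style round. All function names, the label-mapping rule, and hyperparameters are assumptions.

```python
# Illustrative sketch only -- not the paper's implementation.
# (a) map positive/negative user feedback to noisy labels;
# (b) train locally with a noise-robust symmetric cross-entropy loss;
# (c) aggregate client models with uniform FedAvg.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 4  # assumption: small text-classification label space


def feedback_to_label(predicted_label: int, feedback_is_positive: bool) -> int:
    """Hypothetical mapping: positive feedback confirms the model's prediction;
    negative feedback is treated as 'some other class', picked at random."""
    if feedback_is_positive:
        return predicted_label
    other = [c for c in range(NUM_CLASSES) if c != predicted_label]
    return other[torch.randint(len(other), (1,)).item()]


def symmetric_cross_entropy(logits, targets, alpha=1.0, beta=1.0):
    """Standard CE plus reverse CE, a common noise-robust training objective."""
    ce = F.cross_entropy(logits, targets)
    pred = F.softmax(logits, dim=1).clamp(min=1e-7)
    one_hot = F.one_hot(targets, NUM_CLASSES).float().clamp(min=1e-4)
    rce = (-pred * one_hot.log()).sum(dim=1).mean()
    return alpha * ce + beta * rce


def local_update(global_model, features, labels, lr=0.01, epochs=1):
    """One client's local training pass on its feedback-derived (noisy) labels."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = symmetric_cross_entropy(model(features), labels)
        loss.backward()
        opt.step()
    return model.state_dict()


def fedavg(client_states):
    """Uniform averaging of client model weights."""
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        for state in client_states[1:]:
            avg[key] = avg[key] + state[key]
        avg[key] = avg[key] / len(client_states)
    return avg


if __name__ == "__main__":
    global_model = nn.Linear(16, NUM_CLASSES)  # stand-in for a text classifier
    client_states = []
    for _ in range(3):  # three simulated edge devices
        x = torch.randn(8, 16)
        preds = global_model(x).argmax(dim=1)
        fb = torch.rand(8) > 0.3  # ~70% positive feedback, i.e. noisy supervision
        y = torch.tensor([feedback_to_label(int(p), bool(f))
                          for p, f in zip(preds, fb)])
        client_states.append(local_update(global_model, x, y))
    global_model.load_state_dict(fedavg(client_states))
```

The key point the sketch tries to convey is that labels never leave the device: each client converts its own feedback into (possibly wrong) labels, trains locally, and only model weights are aggregated, which is why a noise-robust objective matters in this setting.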