Paper Title

No One Left Behind: Inclusive Federated Learning over Heterogeneous Devices

Paper Authors

Ruixuan Liu, Fangzhao Wu, Chuhan Wu, Yanlin Wang, Lingjuan Lyu, Hong Chen, Xing Xie

Paper Abstract

Federated learning (FL) is an important paradigm for training global models from decentralized data in a privacy-preserving way. Existing FL methods usually assume the global model can be trained on any participating client. However, in real applications the devices of clients are usually heterogeneous and have different computing power. Although big models like BERT have achieved huge success in AI, it is difficult to apply them to heterogeneous FL with weak clients. Straightforward solutions such as removing the weak clients or using a small model to fit all clients lead to problems such as under-representation of the dropped clients and inferior accuracy due to data loss or limited model representation ability. In this work, we propose InclusiveFL, a client-inclusive federated learning method to handle this problem. The core idea of InclusiveFL is to assign models of different sizes to clients with different computing capabilities: bigger models for powerful clients and smaller ones for weak clients. We also propose an effective method to share knowledge among multiple local models with different sizes. In this way, all the clients can participate in model learning in FL, and the final model can be big and powerful enough. Besides, we propose a momentum knowledge distillation method to better transfer the knowledge in the big models on powerful clients to the small models on weak clients. Extensive experiments on many real-world benchmark datasets demonstrate the effectiveness of the proposed method in learning accurate models from clients with heterogeneous devices under the FL framework.
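To make the abstract's two mechanisms more concrete, below is a minimal PyTorch sketch of (1) assigning Transformer encoders of different depths to clients with different compute budgets and (2) distilling from a momentum (EMA) teacher built from the biggest model. This is an illustration under assumptions, not the paper's implementation: the names `ClientNet`, `ema_update`, and `distill_loss`, the capability-to-depth mapping, and the soft-label KL loss are all invented here for exposition; the paper's actual layer-sharing and momentum-distillation procedures are defined in its method section.

```python
# Minimal sketch (NOT the authors' code) of the two ideas named in the
# abstract: per-client model depths and momentum knowledge distillation.
# All names and hyperparameters here are hypothetical.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class ClientNet(nn.Module):
    """Transformer encoder whose depth is chosen per client capability."""
    def __init__(self, d_model: int, n_layers: int, n_classes: int = 10):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(x).mean(dim=1))  # mean-pooled logits

# Assumed mapping from device capability to encoder depth.
depths = {"weak": 2, "medium": 4, "strong": 8}
models = {cap: ClientNet(d_model=64, n_layers=d) for cap, d in depths.items()}

# Momentum teacher: an exponential moving average of the biggest model.
teacher = copy.deepcopy(models["strong"])

@torch.no_grad()
def ema_update(teacher: nn.Module, student: nn.Module, m: float = 0.99) -> None:
    """teacher <- m * teacher + (1 - m) * student, parameter-wise."""
    for pt, ps in zip(teacher.parameters(), student.parameters()):
        pt.mul_(m).add_(ps, alpha=1.0 - m)

def distill_loss(student_logits, teacher_logits, T: float = 2.0):
    """Standard soft-label KL distillation with temperature T."""
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)

x = torch.randn(8, 16, 64)             # dummy batch: (batch, seq_len, d_model)
ema_update(teacher, models["strong"])  # refresh the momentum teacher
with torch.no_grad():
    t_logits = teacher(x)              # teacher only provides soft targets
loss = distill_loss(models["weak"](x), t_logits)
loss.backward()                        # gradients flow into the weak model only
```

The usual rationale for a "momentum" teacher, and a plausible reading of the abstract, is that an EMA of the big model's weights changes more smoothly than any single round's snapshot, giving the small models a more stable distillation target across FL rounds.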
