Adaptive Personalization in Federated Learning for Highly Non-i.i.d. Data

Paper title

Adaptive Personalization in Federated Learning for Highly Non-i.i.d. Data

Authors

Yousef Yeganeh, Azade Farshad, Johann Boschmann, Richard Gaus, Maximilian Frantzen, Nassir Navab

Abstract

Federated learning (FL) is a distributed learning method that offers medical institutes the prospect of collaboration in a global model while preserving the privacy of their patients. Although most medical centers conduct similar medical imaging tasks, their differences, such as specializations, number of patients, and devices, lead to distinctive data distributions. Data heterogeneity poses a challenge for FL and the personalization of the local models. In this work, we investigate an adaptive hierarchical clustering method for FL to produce intermediate semi-global models, so clients with similar data distribution have the chance of forming a more specialized model. Our method forms several clusters consisting of clients with the most similar data distributions; then, each cluster continues to train separately. Inside the cluster, we use meta-learning to improve the personalization of the participants' models. We compare the clustering approach with classical FedAvg and centralized training by evaluating our proposed methods on the HAM10k dataset for skin lesion classification with extreme heterogeneous data distribution. Our experiments demonstrate significant performance gain in heterogeneous distribution compared to standard FL methods in classification accuracy. Moreover, we show that the models converge faster if applied in clusters and outperform centralized training while using only a small subset of data.
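
The clustering idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each client is summarized by a hypothetical feature vector (here, a normalized label histogram standing in for its "data distribution"), groups clients with a plain average-linkage agglomerative procedure and a hypothetical distance `threshold`, and then applies sample-size-weighted FedAvg inside each cluster to obtain one semi-global model per cluster. The meta-learning personalization step is omitted, and all function names are illustrative.

```python
import numpy as np


def fedavg(weights, sizes):
    """Sample-size-weighted average of client parameter vectors (FedAvg)."""
    weights = np.asarray(weights, dtype=float)
    sizes = np.asarray(sizes, dtype=float)
    return (weights * sizes[:, None]).sum(axis=0) / sizes.sum()


def agglomerative_clusters(features, threshold):
    """Average-linkage agglomerative clustering of clients.

    Assumption: `features` holds one vector per client (e.g. a label
    histogram). Clusters are merged until the closest pair is farther
    apart than `threshold`.
    """
    features = np.asarray(features, dtype=float)
    clusters = [[i] for i in range(len(features))]

    def linkage(a, b):
        # Mean pairwise Euclidean distance between two groups of clients.
        return float(np.mean([np.linalg.norm(features[i] - features[j])
                              for i in a for j in b]))

    while len(clusters) > 1:
        pairs = [(linkage(clusters[a], clusters[b]), a, b)
                 for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        dist, a, b = min(pairs)
        if dist > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


def cluster_models(client_weights, client_sizes, features, threshold):
    """Return one FedAvg-aggregated (semi-global) model per cluster."""
    models = {}
    for members in agglomerative_clusters(features, threshold):
        models[tuple(members)] = fedavg(
            [client_weights[i] for i in members],
            [client_sizes[i] for i in members],
        )
    return models
```

In this sketch, clients whose feature vectors are close end up sharing a model, while dissimilar clients are kept apart instead of being averaged into a single global model. The choice of features, linkage, and threshold are assumptions made for illustration only.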
