Clustered Federated Learning: Model-Agnostic Distributed Multitask Optimization Under Privacy Constraints
- Authors
- Sattler, Felix; Mueller, Klaus-Robert; Samek, Wojciech
- Issue Date
- August 2021
- Publisher
- IEEE (Institute of Electrical and Electronics Engineers, Inc.)
- Keywords
- Clustering; Data models; Optimization; Privacy; Servers; Sociology; Statistics; Training; distributed learning; federated learning; multi-task learning
- Citation
- IEEE Transactions on Neural Networks and Learning Systems, vol. 32, no. 8, pp. 3710-3722
- Indexed
- SCIE; SCOPUS
- Journal Title
- IEEE Transactions on Neural Networks and Learning Systems
- Volume
- 32
- Number
- 8
- Start Page
- 3710
- End Page
- 3722
- URI
- https://scholar.korea.ac.kr/handle/2021.sw.korea/136962
- DOI
- 10.1109/TNNLS.2020.3015958
- ISSN
- 2162-237X
- Abstract
- Federated learning (FL) is currently the most widely adopted framework for the collaborative training of (deep) machine learning models under privacy constraints. Despite its popularity, it has been observed that FL yields suboptimal results when the local clients' data distributions diverge. To address this issue, we present clustered FL (CFL), a novel federated multitask learning (FMTL) framework that exploits geometric properties of the FL loss surface to group the client population into clusters with jointly trainable data distributions. In contrast to existing FMTL approaches, CFL requires no modification of the FL communication protocol, is applicable to general nonconvex objectives (in particular, deep neural networks), does not require the number of clusters to be known a priori, and comes with strong mathematical guarantees on the clustering quality. CFL is flexible enough to handle client populations that vary over time and can be implemented in a privacy-preserving way. Because clustering is performed only after FL has converged to a stationary point, CFL can be viewed as a postprocessing method that always achieves performance greater than or equal to that of conventional FL by allowing clients to arrive at more specialized models. We verify our theoretical analysis in experiments with deep convolutional and recurrent neural networks on commonly used FL data sets.
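The abstract describes CFL's core mechanism: once conventional FL stalls at a stationary point, the server inspects the clients' weight updates and, if their data distributions appear incongruent, bipartitions the population so that similar clients continue training together. The sketch below illustrates one plausible reading of that step, using cosine similarity between client updates; the function names, the greedy bipartitioning, and the thresholds `eps1`/`eps2` are illustrative assumptions, not the authors' reference implementation.

```python
# A minimal sketch of the CFL splitting step described in the abstract.
# All names and thresholds here are assumptions made for illustration.
import numpy as np

def cosine_similarity_matrix(updates):
    """Pairwise cosine similarities between flattened client weight updates."""
    U = np.stack([u / (np.linalg.norm(u) + 1e-12) for u in updates])
    return U @ U.T

def cfl_split(updates, eps1=0.05, eps2=1.0):
    """Decide whether to bipartition the client population.

    updates : list of 1-D numpy arrays, one weight update per client.
    eps1    : the averaged update must be (near) stationary.
    eps2    : at least one client must still receive a large update,
              signalling incongruent data distributions.
    Returns two index lists (the bipartition) or None (keep training jointly).
    """
    mean_norm = np.linalg.norm(np.mean(updates, axis=0))
    max_norm = max(np.linalg.norm(u) for u in updates)
    if not (mean_norm < eps1 and max_norm > eps2):
        return None  # FL has not stalled, or the clients look congruent

    S = cosine_similarity_matrix(updates)
    # Greedy bipartition: seed the two clusters with the least similar pair,
    # then assign every other client to the closer seed. (A simple stand-in
    # for the optimal bipartitioning analyzed in the paper.)
    i, j = np.unravel_index(np.argmin(S), S.shape)
    c1, c2 = [i], [j]
    for k in range(len(updates)):
        if k in (i, j):
            continue
        (c1 if S[k, i] >= S[k, j] else c2).append(k)
    return c1, c2
```

In such a setup, a server loop would call `cfl_split` after each federated round and, whenever a split is returned, recursively restart FL within each cluster; because splitting happens only after convergence, the joint FL solution is retained as a fallback.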
- Appears in Collections
- Graduate School > Department of Artificial Intelligence > 1. Journal Articles