Decentralized Learning, and in particular Federated Learning (FL), has recently emerged as a promising approach to collaboratively train predictive ML models without disclosing potentially sensitive raw user data (e.g., users' habits or preferences), paving the way for stronger privacy guarantees. FL keeps the data distributed across learners (i.e., the FL participants) and iteratively refines a global ML model by collecting and aggregating the model updates each learner computes locally.
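To make this collect-and-aggregate scheme concrete, the following is a minimal sketch of a FedAvg-style training loop in Python/NumPy. The linear model, the helper names (local_train, fed_avg), and all hyperparameters are illustrative assumptions for this sketch, not part of this work or of any specific FL framework.

```python
import numpy as np

def local_train(global_w, X, y, lr=0.1, epochs=5):
    """One learner's local gradient steps on a linear model; raw data stays local."""
    w = global_w.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)   # gradient of the local MSE loss
        w -= lr * grad
    return w

def fed_avg(global_w, clients, rounds=20):
    """Server loop: collect locally computed models and aggregate them."""
    for _ in range(rounds):
        local_ws, sizes = [], []
        for X, y in clients:                     # only model updates leave the client
            local_ws.append(local_train(global_w, X, y))
            sizes.append(len(y))
        weights = np.array(sizes) / sum(sizes)   # weight learners by local data size
        global_w = sum(wk * lw for wk, lw in zip(weights, local_ws))
    return global_w

# Toy usage: three learners holding private samples from the same linear model
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + 0.1 * rng.normal(size=50)
    clients.append((X, y))

print(fed_avg(np.zeros(2), clients))             # converges towards [2, -1]
```

In this sketch the server never sees the learners' (X, y) pairs; it only receives locally trained parameter vectors and averages them, weighted by each learner's dataset size.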
Although collaborative learning systems open up new opportunities for exploiting a wider range of data in data science pipelines, implementing them can be challenging due to the peculiarities of federated environments: data statistics vary across learners, participating devices have limited and heterogeneous hardware, sensitive information may still leak during training, and optimizing the learning process in a distributed setting is complex.