Federated learning, also known as FL (great initials, BTW), has gained a lot of attention over the last couple of years as a way to train a model without needing to centralize the data.
It’s a technique first brought to production by Google, which used it to develop, train, and continuously improve Gboard, Android’s predictive keyboard.
Here’s an interesting read on the topic.
Traditionally, training a model has required the data scientist to have direct access to a centralized data set. However, if a data set is dispersed and sensitive, centralizing it for training can be an insurmountable task. In the case of Gboard, for example, centralizing the data would require Google to have direct access to every keystroke every user has performed. For many, this would be an invasion of privacy, and it could inadvertently allow Google to collect passwords, credit card numbers, and other sensitive text users type.

Federated learning alleviates this problem by sharing only updates to a model instead of the data used to train it.
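To make the idea concrete, here is a minimal sketch of the federated averaging pattern that underlies this approach. It is illustrative only, not Google's actual implementation: a toy linear model is trained locally on each of three simulated clients, and the server sees only the resulting weight vectors, never the raw data. All names (`local_update`, `federated_average`, the client sizes) are made up for the example.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Each client trains on its own private data and returns only weights."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w

def federated_average(client_weights, client_sizes):
    """Server combines updates, weighted by each client's data size."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])

# Private data stays on each of three simulated clients.
clients = []
for n in (20, 30, 50):
    X = rng.normal(size=(n, 2))
    clients.append((X, X @ true_w))

# Each round: clients train locally, server averages the returned weights.
global_w = np.zeros(2)
for _ in range(20):
    updates = [local_update(global_w, X, y) for X, y in clients]
    global_w = federated_average(updates, [len(y) for _, y in clients])

print(global_w)  # converges toward true_w without the data ever leaving the clients
```

Notice that the only thing crossing the network in each round is a two-element weight vector per client; the keystroke-like raw data (`X`, `y`) never leaves the client loop.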