# Federated Learning

[Slides](https://github.com/wangshusen/DeepLearning/blob/master/Slides/14_Parallel_4.pdf)

[YouTube](https://www.youtube.com/watch?v=STxtRucv_zo)

## Motivating Examples

![federated\_learning\_1](https://637078585-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MYsi-h_n0zY_8MKKgyu%2Fuploads%2Fgit-blob-920483183146da1183b96181643b06c4b1fbda7e%2Ffederated_learning_1.png?alt=media)

![federated\_learning\_2](https://637078585-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MYsi-h_n0zY_8MKKgyu%2Fuploads%2Fgit-blob-18274184c2aea36e4e879184c39edc21ad1f8984%2Ffederated_learning_2.png?alt=media)

## What is federated learning

Federated learning \[1, 2] is a kind of distributed learning.

How does federated learning differ from traditional distributed learning?

1. Users have control over their devices and data.
2. Worker nodes are unstable.
3. Communication cost is higher than computation cost.
4. Data stored on worker nodes are not IID (independent and identically distributed).
5. The amount of data per worker is severely imbalanced.
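
To make points 4 and 5 concrete, here is a minimal sketch of how a toy dataset could be split so that shards are both non-IID and imbalanced. The function `skewed_partition`, the geometric shard sizes, and the toy labels are assumptions made for illustration only; they do not come from the slides.

```python
from collections import Counter

def skewed_partition(labels, num_workers):
    """Split sample indices into shards that are non-IID and imbalanced."""
    # Sorting by label makes each contiguous shard dominated by a few labels (non-IID).
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    # Shard sizes grow as 1 : 2 : 4 : ..., so the amount of data is imbalanced.
    weights = [2 ** k for k in range(num_workers)]
    total = sum(weights)
    shards, start = [], 0
    for k, wgt in enumerate(weights):
        # The last shard takes whatever is left so that no sample is dropped.
        end = len(order) if k == num_workers - 1 else start + len(order) * wgt // total
        shards.append(order[start:end])
        start = end
    return shards

labels = [i % 4 for i in range(150)]  # hypothetical toy labels with 4 classes
shards = skewed_partition(labels, num_workers=4)
for k, shard in enumerate(shards):
    # Print the shard size and its label histogram for each worker.
    print(k, len(shard), dict(Counter(labels[i] for i in shard)))
```

Each worker ends up with a different number of samples and a different label mix, which is the setting federated learning has to cope with.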

## Let us recall parallel gradient descent

![federated\_learning\_3](https://637078585-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MYsi-h_n0zY_8MKKgyu%2Fuploads%2Fgit-blob-9cd49c9bce135c335b446ca43e1029995da1e7cc%2Ffederated_learning_3.png?alt=media)

![federated\_learning\_4](https://637078585-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MYsi-h_n0zY_8MKKgyu%2Fuploads%2Fgit-blob-436625e625fa7fa8f98d859aad8f5ed97a97e920%2Ffederated_learning_4.png?alt=media)
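
As a reminder of the idea, the following is a minimal single-process sketch of parallel gradient descent on a toy least-squares problem. The workers are simulated with a plain loop, and the names `local_gradient`, `parallel_gd_step`, and `make_shard`, the learning rate, and the data are all assumptions for illustration rather than the code behind the slides.

```python
import random

def local_gradient(w, shard):
    """Sum of least-squares gradients over one worker's samples."""
    g = [0.0] * len(w)
    for x, y in shard:
        err = sum(wj * xj for wj, xj in zip(w, x)) - y
        for j, xj in enumerate(x):
            g[j] += err * xj
    return g

def parallel_gd_step(w, shards, lr):
    # Each worker computes a gradient on its own shard (in parallel in a real system).
    grads = [local_gradient(w, shard) for shard in shards]
    # The server sums the worker gradients and takes one gradient-descent step.
    n = sum(len(shard) for shard in shards)
    total = [sum(g[j] for g in grads) for j in range(len(w))]
    return [wj - lr * tj / n for wj, tj in zip(w, total)]

random.seed(0)
true_w = [2.0, -1.0]  # hypothetical ground truth for the toy problem

def make_shard(n):
    shard = []
    for _ in range(n):
        x = [random.uniform(-1, 1), random.uniform(-1, 1)]
        shard.append((x, sum(t * xj for t, xj in zip(true_w, x))))
    return shard

shards = [make_shard(20) for _ in range(3)]
w = [0.0, 0.0]
for _ in range(200):
    # One communication round per gradient step: broadcast w, collect gradients.
    w = parallel_gd_step(w, shards, lr=0.5)
print(w)
```

Note that every update of `w` requires one round of communication between the server and all workers.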

## Federated Averaging Algorithm

![federated\_learning\_5](https://637078585-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MYsi-h_n0zY_8MKKgyu%2Fuploads%2Fgit-blob-fb06728253166f3007257ffc344c6eec1c6e2486%2Ffederated_learning_5.png?alt=media)

![federated\_learning\_6](https://637078585-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MYsi-h_n0zY_8MKKgyu%2Fuploads%2Fgit-blob-91bca26ff642b1ef3e2c7a16884e62bd66eacfd5%2Ffederated_learning_6.png?alt=media)
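
The sketch below illustrates one way federated averaging can be written for the same kind of toy least-squares problem. It is a simplified, single-process illustration under stated assumptions: the names `local_update` and `fedavg_round`, the number of local steps, and the data are hypothetical, and the server weights each worker by its sample count as in \[1] (a plain mean over workers is a common simplification).

```python
import random

def local_update(w, shard, lr, local_steps):
    """Run several gradient-descent steps on one worker's data, starting from the server's w."""
    w = list(w)
    for _ in range(local_steps):
        g = [0.0] * len(w)
        for x, y in shard:
            err = sum(wj * xj for wj, xj in zip(w, x)) - y
            for j, xj in enumerate(x):
                g[j] += err * xj / len(shard)
        w = [wj - lr * gj for wj, gj in zip(w, g)]
    return w

def fedavg_round(w, shards, lr, local_steps):
    # Workers train locally and send back parameters, not gradients.
    local_ws = [local_update(w, shard, lr, local_steps) for shard in shards]
    # The server averages the parameters, weighted by each worker's sample count.
    n = sum(len(shard) for shard in shards)
    return [
        sum(len(shard) * lw[j] for shard, lw in zip(shards, local_ws)) / n
        for j in range(len(w))
    ]

random.seed(0)
true_w = [2.0, -1.0]  # hypothetical ground truth for the toy problem

def make_shard(n):
    shard = []
    for _ in range(n):
        x = [random.uniform(-1, 1), random.uniform(-1, 1)]
        shard.append((x, sum(t * xj for t, xj in zip(true_w, x))))
    return shard

shards = [make_shard(n) for n in (10, 20, 30)]  # imbalanced shard sizes
w = [0.0, 0.0]
for _ in range(100):
    w = fedavg_round(w, shards, lr=0.5, local_steps=5)
print(w)
```

In this sketch each communication round carries `local_steps` gradient computations per worker, so more local computation is traded for fewer rounds of communication.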

## Computation vs. Communication

![federated\_learning\_7](https://637078585-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MYsi-h_n0zY_8MKKgyu%2Fuploads%2Fgit-blob-049aa6ffb082f83e6235cae2dae676022c1461f7%2Ffederated_learning_7.png?alt=media)

![federated\_learning\_8](https://637078585-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MYsi-h_n0zY_8MKKgyu%2Fuploads%2Fgit-blob-4fa7a27f4db1eaaffba6dfb098e8b0ff6eb0194e%2Ffederated_learning_8.png?alt=media)

## References

* \[1] McMahan and others: Communication-efficient learning of deep networks from decentralized data. In AISTATS, 2017.
* \[2] Konečný, McMahan, and Ramage: Federated optimization: Distributed optimization beyond the datacenter. arXiv:1511.03575, 2015.
