[ICLR 2018] Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training
Updated Aug 10, 2021 · Python
Vector quantization for stochastic gradient descent.
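As a rough illustration of gradient quantization (a minimal sign-plus-scale scheme; the function names are hypothetical and this is not any listed repository's exact algorithm):

```python
import numpy as np

def quantize_sign(grad):
    # Keep one sign bit per coordinate plus a single shared scale,
    # reducing a float32 gradient to roughly 1 bit per element.
    scale = np.abs(grad).mean()
    return np.sign(grad), scale

def dequantize(signs, scale):
    # Reconstruct an approximate gradient from signs and the scale.
    return signs * scale

g = np.array([0.5, -2.0, 0.1, -0.4])
signs, scale = quantize_sign(g)
g_hat = dequantize(signs, scale)
```

The reconstruction preserves each coordinate's direction while the shared scale keeps the magnitude in the right range.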
[CCS 2021] "DataLens: Scalable Privacy Preserving Training via Gradient Compression and Aggregation" by Boxin Wang*, Fan Wu*, Yunhui Long*, Luka Rimanic, Ce Zhang, Bo Li
We present a set of all-reduce compatible gradient compression algorithms which significantly reduce the communication overhead while maintaining the performance of vanilla SGD. We empirically evaluate the performance of the compression methods by training deep neural networks on the CIFAR10 dataset.
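One way a compression scheme can stay all-reduce compatible is for every worker to select the same coordinates from a shared random seed, so the sparse payloads align and can be summed directly; a minimal random-k sketch under that assumption (illustrative only, not the repository's exact method):

```python
import numpy as np

def compress_random_k(grad, k, seed):
    # All workers draw identical indices from the shared seed, so
    # compressed vectors line up and a plain all-reduce (sum) works.
    rng = np.random.default_rng(seed)
    idx = rng.choice(grad.size, size=k, replace=False)
    return idx, grad.ravel()[idx]

def decompress(idx, values, shape):
    # Scatter the (possibly summed) values back into a dense gradient.
    out = np.zeros(int(np.prod(shape)))
    out[idx] = values
    return out.reshape(shape)

# Two "workers" with different gradients but the same seed pick the
# same coordinates, so their payloads are directly addable.
g1 = np.arange(8, dtype=float)
g2 = np.ones(8)
i1, v1 = compress_random_k(g1, 3, seed=42)
i2, v2 = compress_random_k(g2, 3, seed=42)
summed = decompress(i1, v1 + v2, g1.shape)  # mimics all-reduce of sparse payloads
```

Top-k selection, by contrast, picks different indices per worker and needs an all-gather rather than an all-reduce, which is why index agreement matters here.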
Geometric median (GM) is a classical method in statistics for achieving a robust estimation of the uncorrupted data; under gross corruption, it achieves the optimal breakdown point of 0.5. However, its computational complexity makes it infeasible for robustifying stochastic gradient descent (SGD) for high-dimensional optimization problems. In th…
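The geometric median itself is commonly computed with Weiszfeld's iteration, which re-weights points by the inverse of their distance to the current estimate; a small sketch showing its robustness to a gross outlier (illustrative, not the paper's accelerated method):

```python
import numpy as np

def geometric_median(points, iters=100, eps=1e-8):
    # Weiszfeld's algorithm: iteratively reweighted averaging.
    y = points.mean(axis=0)
    for _ in range(iters):
        d = np.linalg.norm(points - y, axis=1)
        d = np.maximum(d, eps)  # guard against division by zero
        w = 1.0 / d
        y_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < eps:
            return y_new
        y = y_new
    return y

# Four clean points around (0.5, 0.5) plus one wild outlier:
clean = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
corrupted = np.vstack([clean, [[100.0, 100.0]]])
gm = geometric_median(corrupted)
```

Unlike the mean, which the outlier drags to roughly (20, 20), the geometric median stays near the center of the clean points.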