Abstract
Decentralized learning (DL) has gained prominence for its potential benefits in terms of scalability, privacy, and fault tolerance. It consists of many nodes that coordinate without a central server and exchange millions of parameters in the inherently iterative process of machine learning (ML) training. In addition, these nodes are connected in complex and potentially dynamic topologies. Assessing the intricate dynamics of such networks is challenging. In the literature, researchers often resort to simulated environments that do not scale and fail to capture practical and crucial behaviors, including those associated with parallelism, data transfer, network delays, and wall-clock time. In this paper, we propose decentralizepy, a distributed framework for decentralized ML that enables the emulation of large-scale learning networks in arbitrary topologies. We demonstrate the capabilities of decentralizepy by deploying techniques such as sparsification and secure aggregation on top of several topologies, including dynamic networks with more than one thousand nodes.

CCS Concepts: • Networks → Programming interfaces; • Computing methodologies → Distributed algorithms; Machine learning algorithms; • Computer systems organization → Peer-to-peer architectures.
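Since the abstract refers to deploying sparsification over node topologies, the following is a minimal, self-contained sketch in plain PyTorch of the kind of per-node step involved: compressing a model update via top-k sparsification and averaging it with neighbors on a static ring. The helper names, the ring layout, and the unweighted averaging are illustrative assumptions, not the decentralizepy API.

```python
# Illustrative sketch (not the decentralizepy API): top-k sparsification of a
# model update and plain averaging with neighbors on a static ring topology.
import torch

def topk_sparsify(tensor, fraction=0.01):
    """Keep only the largest-magnitude entries of a tensor; zero out the rest."""
    flat = tensor.flatten()
    k = max(1, int(fraction * flat.numel()))
    _, indices = torch.topk(flat.abs(), k)
    sparse = torch.zeros_like(flat)
    sparse[indices] = flat[indices]
    return sparse.view_as(tensor)

def ring_neighbors(rank, world_size):
    """Static ring: each node exchanges parameters with its two adjacent nodes."""
    return [(rank - 1) % world_size, (rank + 1) % world_size]

def average_with_neighbors(my_params, received_params):
    """Unweighted averaging of the local model with models received from neighbors."""
    stacked = torch.stack([my_params] + received_params)
    return stacked.mean(dim=0)

# Example: node 3 of 8 sparsifies its update before sharing it with its ring neighbors.
world_size, rank = 8, 3
params = torch.randn(1000)                     # stand-in for a flattened model
to_send = topk_sparsify(params, fraction=0.1)  # compress before communication
neighbors = ring_neighbors(rank, world_size)   # -> [2, 4]
```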