Abstract
In decentralized optimization, $m$ agents form a network and communicate only with their neighbors, which gives advantages in data ownership, privacy, and scalability. At the same time, decentralized stochastic gradient descent (SGD) methods, as popular decentralized algorithms for training large-scale machine learning models, have shown their superiority over centralized counterparts. Distributed stochastic gradient tracking (DSGT) (Pu & Nedić, 2021) has been recognized as a popular and state-of-the-art decentralized SGD method due to its proper theoretical guarantees. However, the theoretical analysis of DSGT (Koloskova et al., 2021) shows that its iteration complexity is $\tilde{\mathcal{O}}\left(\frac{\bar{\sigma}^2}{m\mu\varepsilon} + \frac{\sqrt{L}\bar{\sigma}}{\mu(1-\lambda_2(W))^{1/2}C_W\sqrt{\varepsilon}}\right)$, where the doubly stochastic matrix $W$ represents the network topology and $C_W$ is a parameter that depends on $W$. Thus, the convergence of DSGT is heavily affected by the topology of the communication network. To overcome this weakness of DSGT, we resort to the snapshot gradient tracking technique and propose two novel algorithms, snapshot DSGT (SS_DSGT) and accelerated snapshot DSGT (ASS_DSGT). We further show that SS_DSGT achieves a lower iteration complexity than DSGT on general communication network topologies. Additionally, ASS_DSGT matches DSGT's iteration complexity $\tilde{\mathcal{O}}\left(\frac{\bar{\sigma}^2}{m\mu\varepsilon} + \frac{\sqrt{L}\bar{\sigma}}{\mu(1-\lambda_2(W))^{1/2}\sqrt{\varepsilon}}\right)$ under the same conditions as DSGT. Numerical experiments validate SS_DSGT's superior performance on general communication network topologies and show better practical performance of ASS_DSGT on the specified $W$ compared to DSGT.
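For orientation, the gradient-tracking recursion that DSGT (Pu & Nedić, 2021) builds on is short enough to sketch directly: each agent mixes iterates with its neighbors through $W$ and descends along a tracker $y$ that estimates the network-average gradient. The NumPy sketch below is a minimal, hypothetical illustration of that standard recursion under our own naming (the `grad` callable, step size `gamma`, and return convention are assumptions); it does not implement the paper's snapshot variants SS_DSGT or ASS_DSGT.

```python
import numpy as np

def dsgt(W, grad, x0, gamma=0.05, num_iters=1000, rng=None):
    """Sketch of the standard DSGT recursion (not the snapshot variants).

    W    : (m, m) doubly stochastic mixing matrix encoding the topology
    grad : callable grad(x, rng) -> (m, d) stochastic gradients, one row
           per agent, evaluated at that agent's iterate (assumed interface)
    x0   : (m, d) initial iterates, one row per agent
    """
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    g = grad(x, rng)
    y = g.copy()                 # tracker starts at the first stochastic gradients
    for _ in range(num_iters):
        x = W @ (x - gamma * y)  # local step along the tracker, then neighbor mixing
        g_new = grad(x, rng)
        y = W @ y + g_new - g    # tracking update keeps y an average-gradient estimate
        g = g_new
    return x.mean(axis=0)        # report the network average of the iterates
```

Because every communication is a multiplication by $W$, the spectral gap $1-\lambda_2(W)$ appearing in the complexity bounds above is exactly the quantity that controls how fast the mixing step spreads information across the network.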
| Original language | English |
|---|---|
| Pages (from-to) | 10765-10791 |
| Number of pages | 27 |
| Journal | Proceedings of Machine Learning Research |
| Volume | 235 |
| Publication status | Published - 2024 |
| Event | 41st International Conference on Machine Learning, ICML 2024 - Vienna, Austria. Duration: 21 Jul 2024 → 27 Jul 2024 |