Abstract
Pairwise algorithms refer to a learning problem with loss functions depending on pairs of examples. There has been remarkable work on analyzing their generalization properties in batch and online settings such as algorithmic stabilities, robustness or regularization. This paper is concerned with distributed pairwise algorithms for dealing with big data, based on a divide-and-conquer strategy. We show that the global estimator of the distributed pairwise algorithm is as good as that of the classical algorithm processing the whole data on a single machine. We present the optimal convergence rate for the distributed pairwise algorithm and provide a theoretical upper bound for the number of local machines under which the optimal rate is retained. Our analysis is achieved by the integral operator decomposition and distributed U-statistics.
| Original language | English |
|---|---|
| Pages (from-to) | 364-373 |
| Number of pages | 10 |
| Journal | Neurocomputing |
| Volume | 333 |
| DOIs | |
| State | Published - 14 Mar 2019 |
| Externally published | Yes |
Keywords
- Distributed method
- Gradient descent
- Pairwise algorithms
- Reproducing kernel Hilbert spaces