Divide and conquer local average regression

Research output: Contribution to journal › Article › peer-review

36 Scopus citations

Abstract

The divide and conquer strategy, which breaks a massive data set into a series of manageable data blocks and combines the independent results of the data blocks to obtain a final decision, has been recognized as a state-of-the-art method to overcome the challenges of massive data analysis. In this paper, we equip the classical local average regression with some divide and conquer strategies to infer the regressive relationship of input-output pairs from a massive data set. When the average mixture, a widely used divide and conquer approach, is adopted, we prove that the optimal learning rate can be achieved under some restrictive conditions on the number of data blocks. We then propose two variants to relax (or remove) these conditions and derive the same optimal learning rate as for the average mixture local average regression. Our theoretical assertions are verified by a series of experimental studies.
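
The average mixture idea described in the abstract can be illustrated with a minimal sketch. It assumes one-dimensional inputs, a Gaussian-kernel Nadaraya-Watson estimate as the local average regressor, and random blocks of roughly equal size. The function names (`nadaraya_watson`, `average_mixture`), the kernel, and the `bandwidth` and `n_blocks` parameters are illustrative choices, not taken from the paper, and the paper's two proposed variants are not shown.

```python
import numpy as np


def nadaraya_watson(x_train, y_train, x_query, bandwidth):
    """Local average estimate at each query point (Gaussian kernel, 1-D inputs)."""
    # Scaled differences between every query point and every training input.
    diffs = (x_query[:, None] - x_train[None, :]) / bandwidth
    weights = np.exp(-0.5 * diffs ** 2)
    # Kernel-weighted average of the responses for each query point.
    return (weights @ y_train) / weights.sum(axis=1)


def average_mixture(x, y, x_query, n_blocks, bandwidth, seed=0):
    """Fit the local estimate on each data block, then average the block estimates."""
    rng = np.random.default_rng(seed)
    # Randomly split the sample indices into n_blocks disjoint blocks.
    blocks = np.array_split(rng.permutation(len(x)), n_blocks)
    estimates = [
        nadaraya_watson(x[idx], y[idx], x_query, bandwidth) for idx in blocks
    ]
    # Average mixture: plain arithmetic mean of the per-block estimates.
    return np.mean(estimates, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    x = rng.uniform(0.0, 1.0, size=2000)
    y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(2000)
    grid = np.linspace(0.1, 0.9, 9)
    fit = average_mixture(x, y, grid, n_blocks=10, bandwidth=0.05)
    print(np.max(np.abs(fit - np.sin(2 * np.pi * grid))))
```

Each block is processed independently, so the per-block estimates could be computed on separate machines and only the query-point predictions need to be combined. With `n_blocks=1` the sketch reduces to the ordinary Nadaraya-Watson estimate on the full sample.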

Original language: English
Pages (from-to): 1326-1350
Number of pages: 25
Journal: Electronic Journal of Statistics
Volume: 11
Issue number: 1
State: Published - 2017

Keywords

  • Divide and conquer strategy
  • K nearest neighbor estimate
  • Local average regression
  • Nadaraya-Watson estimate
