Fully corrective gradient boosting with squared hinge: Fast learning rates and early stopping

Research output: Contribution to journal › Article › peer-review

16 Scopus citations

Abstract

In this paper, we propose an efficient boosting method with theoretical guarantees for binary classification. There are three key ingredients of the proposed boosting method: a fully corrective greedy (FCG) update, a differentiable squared hinge (also called truncated quadratic) loss function, and an efficient alternating direction method of multipliers (ADMM) solver. Compared with traditional boosting methods, on the one hand, the FCG update accelerates the numerical convergence rate, and on the other hand, the squared hinge loss inherits the robustness of the hinge loss for classification while maintaining the theoretical benefits of the square loss in regression. The ADMM solver, with guaranteed fast convergence, then provides an efficient implementation of the proposed boosting method. We conduct both theoretical analysis and numerical verification to demonstrate the superior performance of the proposed method. Theoretically, a fast learning rate of order O((m/log m)^{-1/2}) is proved under certain standard assumptions, where m is the size of the sample set. Numerically, a series of toy simulations and real-data experiments are carried out to verify the developed theory.
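The ingredients described above can be illustrated with a minimal sketch. The code below is a hypothetical toy implementation, not the paper's algorithm: it uses decision stumps as weak learners, the differentiable squared hinge loss, a greedy selection step, a fully corrective step that jointly re-optimizes all coefficients by plain gradient descent (in place of the paper's ADMM solver), and early stopping when the training loss stalls. All function names and parameters are illustrative assumptions.

```python
import numpy as np

def squared_hinge(y, f):
    # Squared (truncated quadratic) hinge loss: max(0, 1 - y*f)**2.
    # Differentiable everywhere, unlike the plain hinge.
    return np.maximum(0.0, 1.0 - y * f) ** 2

def stump(X, j, t):
    # Decision stump weak learner: sign of (feature j > threshold t).
    return np.where(X[:, j] > t, 1.0, -1.0)

def fcg_boost(X, y, rounds=10, corrective_steps=100, lr=0.5, tol=1e-6):
    """Toy fully corrective greedy (FCG) boosting with squared hinge loss.
    Each round: (1) greedily add the stump most correlated with the
    negative functional gradient; (2) fully corrective step: re-optimize
    ALL coefficients jointly by gradient descent; (3) stop early when the
    training loss no longer improves.  (Sketch only; the paper uses ADMM
    for the corrective step.)"""
    m, d = X.shape
    picked, coefs, prev = [], np.zeros(0), np.inf
    H = np.zeros((m, 0))
    for _ in range(rounds):
        f = H @ coefs if picked else np.zeros(m)
        # Negative functional gradient of the mean squared hinge loss.
        g = 2.0 * y * np.maximum(0.0, 1.0 - y * f)
        # Greedy selection over all (feature, threshold) stump candidates.
        best, best_val = None, -1.0
        for j in range(d):
            for t in np.unique(X[:, j]):
                c = abs(stump(X, j, t) @ g)
                if c > best_val:
                    best, best_val = (j, t), c
        picked.append(best)
        H = np.column_stack([stump(X, j, t) for j, t in picked])
        coefs = np.append(coefs, 0.0)
        # Fully corrective step: jointly refit all coefficients.
        for _ in range(corrective_steps):
            margins = y * (H @ coefs)
            grad = H.T @ (-2.0 * y * np.maximum(0.0, 1.0 - margins)) / m
            coefs -= lr * grad
        loss = squared_hinge(y, H @ coefs).mean()
        if prev - loss < tol:  # early stopping
            break
        prev = loss
    return picked, coefs
```

On a linearly separable toy set, the first round already finds a perfect stump and the corrective step drives the training loss to zero, after which early stopping terminates the rounds.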

Original language: English
Pages (from-to): 136-151
Number of pages: 16
Journal: Neural Networks
Volume: 147
DOIs
State: Published - Mar 2022

Keywords

  • Boosting
  • Early stopping
  • Fully corrective greedy
  • Learning theory
  • Squared hinge
