Sparse k-means with ℓ∞/ℓ0 penalty for high-dimensional data clustering

Research output: Contribution to journal › Article › peer-review

Abstract

One of the existing sparse clustering approaches, ℓ1-k-means, maximizes the weighted between-cluster sum of squares subject to an ℓ1 penalty. In this paper, we propose a sparse clustering method based on an ℓ∞/ℓ0 penalty, which we call ℓ0-k-means, and design an efficient iterative algorithm for solving it. To compare the theoretical properties of ℓ1- and ℓ0-k-means, we show that both can be explained explicitly from a thresholding perspective, each corresponding to a different thresholding function. Moreover, ℓ1- and ℓ0-k-means are proven to have a screening consistency property under Gaussian mixture models. Experiments on synthetic and real data show that ℓ0-k-means outperforms ℓ1-k-means.
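The thresholding perspective mentioned in the abstract can be illustrated with a small sketch. It assumes, as is common for these penalties but not stated in the abstract, that the ℓ1 case corresponds to soft thresholding of per-feature scores and the ℓ0-type case to hard thresholding that keeps the largest scores. The function names, the score vector (standing in for per-feature between-cluster sums of squares), and the parameters `delta` and `s` are hypothetical and are not taken from the paper.

```python
import numpy as np

def soft_threshold_weights(scores, delta):
    """Assumed l1-style rule: shrink each score by delta, clip at zero,
    then rescale the result to unit l2 norm."""
    shrunk = np.maximum(scores - delta, 0.0)
    norm = np.linalg.norm(shrunk)
    return shrunk / norm if norm > 0 else shrunk

def hard_threshold_weights(scores, s):
    """Assumed l0-style rule: give weight 1 to the s largest scores
    and weight 0 to every other feature."""
    weights = np.zeros_like(scores, dtype=float)
    keep = np.argsort(scores)[::-1][:s]  # indices of the s largest scores
    weights[keep] = 1.0
    return weights

# Hypothetical per-feature scores: two informative features, two noise features.
scores = np.array([5.0, 0.2, 3.0, 0.1])
w_soft = soft_threshold_weights(scores, delta=1.0)
w_hard = hard_threshold_weights(scores, s=2)
print(w_soft, w_hard)
```

Under these assumptions both rules zero out the two low-scoring features, but the soft rule also shrinks the surviving weights in proportion to their scores, while the hard rule weights every selected feature equally.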

Original language: English
Pages (from-to): 1265-1284
Number of pages: 20
Journal: Statistica Sinica
Volume: 28
Issue number: 3
State: Published - Jul 2018

Keywords

  • High-dimensional data clustering
  • Screening property
  • Sparse k-means
