Sparse Regularization in Fuzzy $c$-Means for High-Dimensional Data Clustering

Research output: Contribution to journal › Article › peer-review

99 Scopus citations

Abstract

In high-dimensional data clustering, the cluster structure is commonly assumed to be confined to a limited number of relevant features rather than the entire feature set. However, identifying the relevant features and discovering the cluster structure remain challenging problems for such data. To address them, this paper proposes a novel fuzzy $c$-means (FCM) model with sparse regularization ($\ell_q$-norm regularization, $0<q\leq 1$). The model is obtained by reformulating the FCM objective function into a weighted between-cluster sum-of-squares form and imposing the sparse regularization on the feature weights. An algorithm is also developed to solve the proposed model explicitly. Compared with existing clustering models, the proposed model can shrink the weights of irrelevant (noisy) features to exactly zero, and it can be solved efficiently in analytic form when $q=1$ or $q=1/2$. Experiments on both synthetic and real-world data sets show that the proposed approach outperforms the existing clustering approaches.
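The abstract does not give the paper's update rules, so the following is only an illustrative sketch of why an $\ell_1$ penalty can drive weights to exactly zero. It uses the standard soft-thresholding operator, the well-known closed-form minimizer of $\tfrac{1}{2}(w-a)^2+\lambda|w|$. The per-feature scores `scores` and the threshold `lam` are hypothetical values chosen for illustration; they are not taken from the paper and do not reproduce its algorithm.

```python
import numpy as np

def soft_threshold(a, lam):
    """Entrywise closed-form minimizer of 0.5*(w - a)**2 + lam*|w|."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

# Hypothetical per-feature scores (assumption for illustration only):
# large for informative features, small for noisy ones.
scores = np.array([3.0, 2.5, 0.2, 0.1])

# Scores whose magnitude is below lam are mapped to exactly zero;
# the remaining scores are shrunk toward zero by lam.
weights = soft_threshold(scores, lam=0.5)
print(weights)
```

In this toy setting the two small scores fall below the threshold and receive a weight of exactly zero, while the two large scores are kept, shrunk by `lam`.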

Original language: English
Article number: 7763769
Pages (from-to): 2616-2627
Number of pages: 12
Journal: IEEE Transactions on Cybernetics
Volume: 47
Issue number: 9
State: Published - Sep 2017

Keywords

  • High-dimensional data clustering
  • $\ell_q$ ($0 < q \leq 1$)-norm regularization
  • fuzzy c-means (FCM)
