Generative Variational-Contrastive Learning for Self-Supervised Point Cloud Representation

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

Self-supervised representation learning for 3D point clouds has attracted increasing attention. However, existing methods in 3D computer vision generally represent latent features with fixed embeddings and impose hard constraints that force the latent feature values of positive samples to converge, which limits the ability of feature extractors to generalize across data domains. To address this issue, we propose a Generative Variational-Contrastive Learning (GVC) model, in which a Gaussian distribution is used to construct a continuous, smoothed representation of the latent features. A distribution constraint and cross-supervision are constructed to improve the transfer ability of the feature extractor across synthetic and real-world data. Specifically, we design a variational contrastive module that constrains the feature distribution, rather than the feature values, corresponding to each sample in the latent space. Moreover, a generative cross-supervision module is introduced to preserve invariant features and promote the consistency of feature distributions among positive samples. Experimental results demonstrate that GVC achieves state-of-the-art performance on different downstream tasks. In particular, with pre-training on the synthetic dataset only, GVC achieves a lead of 8.4% and 14.2% when transferring to the real-world dataset in linear classification and few-shot classification, respectively.
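To make the idea of constraining feature distributions rather than feature values concrete, the following is a minimal sketch and not the paper's implementation. It assumes each augmented view is encoded as a diagonal Gaussian given by a mean and a log-variance vector, and it uses a symmetric KL divergence as a stand-in distribution constraint between positive samples. The function names, the choice of symmetric KL, and the toy encoder outputs are all illustrative assumptions; the abstract does not specify the actual loss.

```python
import math
import random


def gaussian_kl(mu_p, logvar_p, mu_q, logvar_q):
    """KL(p || q) between two diagonal Gaussians, summed over latent dimensions."""
    total = 0.0
    for mp, lp, mq, lq in zip(mu_p, logvar_p, mu_q, logvar_q):
        total += 0.5 * (lq - lp + (math.exp(lp) + (mp - mq) ** 2) / math.exp(lq) - 1.0)
    return total


def symmetric_kl(mu_a, logvar_a, mu_b, logvar_b):
    """Assumed distribution constraint: average of KL(a || b) and KL(b || a)."""
    forward = gaussian_kl(mu_a, logvar_a, mu_b, logvar_b)
    backward = gaussian_kl(mu_b, logvar_b, mu_a, logvar_a)
    return 0.5 * (forward + backward)


def reparameterize(mu, logvar, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), one value per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


# Hypothetical encoder outputs (mean, log-variance) for three views.
view_a = ([0.0, 1.0], [0.0, 0.0])
view_b = ([0.0, 1.0], [0.0, 0.0])  # same distribution as view_a
view_c = ([1.0, 1.0], [0.0, 0.0])  # mean shifted by 1 in the first dimension

same = symmetric_kl(*view_a, *view_b)   # identical distributions
diff = symmetric_kl(*view_a, *view_c)   # shifted distribution
z = reparameterize(*view_a, random.Random(0))  # one latent sample from view_a
```

In this sketch the constraint is zero when two views share the same latent distribution and grows as their means or variances diverge, so individual sampled feature values such as `z` are free to differ while the distributions are pulled together.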

Original language: English
Pages (from-to): 6154-6166
Number of pages: 13
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume: 46
Issue number: 9
State: Published - 2024

Keywords

  • Contrastive learning
  • generative learning
  • point cloud
  • self-supervised
  • variational inference
