Point-RMAE: Reinforcement Masked Autoencoder for 3D Representation Learning

  • Haozhe Cheng
  • Lintong Wei
  • Wenjing Wang
  • Wenbiao Yan
  • Jinqian Chen
  • Jian Lu
  • Kun Yue
  • Jihua Zhu
  • Xi'an Jiaotong University
  • Xi'an Polytechnic University
  • Yunnan University

Research output: Contribution to journal › Article › peer-review

Abstract

The mainstream 3D masked point modeling community typically employs predefined, fixed-ratio random or block masking strategies, aiming to obtain optimal representations and achieve high downstream performance. However, these empirical designs overlook the significant differences in geometric information and structural importance among 3D points, leading to a suboptimal trade-off between the representation capture capability and the reconstruction difficulty of such masking strategies. To address this issue, we are the first to present this decision-making problem to a reinforcement learning agent and propose a Reinforcement Masked Autoencoder for 3D representation learning, named Point-RMAE. Guided by geometric features as state factors, this method leverages the Masking Strategy Analyzer and the Dynamic Masking Generator to adaptively decide and apply the masking strategy during pretraining. The Masking Ratio Scheduling module dynamically adjusts the masking ratio based on the optimal strategy. Subsequently, the analyzer is updated by multiscale rewards derived from reconstruction quality, distribution-aware feedback, and policy exploration. Notably, to enrich the reward function with distribution-aware signals and avoid the decision collapse issue, we propose a Flow Matching Point Cloud Fast Generator that guides the selected masking decisions. Our method achieves outstanding performance across downstream tasks such as shape classification, medical diagnosis, object detection, action recognition, denoising, and multiscale scene segmentation on ten popular 3D and 4D datasets. More importantly, Point-RMAE pioneers the application of reinforcement learning in 3D self-supervised representation learning.
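
To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows the two fixed-ratio baselines the abstract mentions (random and block masking over patch centers) and a toy epsilon-greedy chooser over (strategy, ratio) pairs standing in for a learned masking policy. All names here (`random_mask`, `block_mask`, `EpsilonGreedyMasker`, `placeholder_reward`) are hypothetical, and the reward is a stand-in: in real pretraining it would be derived from reconstruction feedback, which this sketch does not model.

```python
import math
import random


def random_mask(centers, ratio, rng):
    """Mask a fixed fraction of patches chosen uniformly at random."""
    n = len(centers)
    k = round(ratio * n)
    masked = set(rng.sample(range(n), k))
    return [i in masked for i in range(n)]


def block_mask(centers, ratio, rng):
    """Mask the k patches nearest to one randomly chosen seed patch."""
    n = len(centers)
    k = round(ratio * n)
    seed = centers[rng.randrange(n)]
    by_distance = sorted(range(n), key=lambda i: math.dist(centers[i], seed))
    masked = set(by_distance[:k])
    return [i in masked for i in range(n)]


class EpsilonGreedyMasker:
    """Toy bandit over masking actions (an assumption, not the paper's analyzer)."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = actions
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = [0] * len(actions)
        self.values = [0.0] * len(actions)

    def select(self):
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.actions))
        return max(range(len(self.actions)), key=lambda a: self.values[a])

    def update(self, action, reward):
        # Incremental mean of the rewards observed for this action.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


def placeholder_reward(mask):
    """Stand-in reward (fraction masked); a real one would use reconstruction quality."""
    return sum(mask) / len(mask)


rng = random.Random(0)
centers = [(rng.random(), rng.random(), rng.random()) for _ in range(64)]
actions = [(random_mask, 0.6), (block_mask, 0.6), (random_mask, 0.8)]
agent = EpsilonGreedyMasker(actions)

for _ in range(20):
    choice = agent.select()
    strategy, ratio = actions[choice]
    mask = strategy(centers, ratio, rng)
    agent.update(choice, placeholder_reward(mask))
```

The bandit ignores per-sample state entirely, whereas the abstract describes a policy conditioned on geometric features; the sketch only illustrates the idea of treating the masking choice as a decision that is updated from feedback rather than fixed in advance.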

Original language: English
Journal: IEEE Transactions on Image Processing
DOIs
State: Accepted/In press - 2026
Externally published: Yes

Keywords

  • 3D point cloud
  • reinforcement learning
  • representation learning
  • self-supervised network
