
FlowRAM: Grounding Flow Matching Policy with Region-Aware Mamba Framework for Robotic Manipulation

  • Xi'an Jiaotong University
  • University of Illinois at Chicago

Research output: Contribution to journal, Conference article, Peer-reviewed

1 Citation (Scopus)

Abstract

Robotic manipulation in high-precision tasks is essential for numerous industrial and real-world applications where accuracy and speed are required. Yet current diffusion-based policy learning methods generally suffer from low computational efficiency due to the iterative denoising process during inference. Moreover, these methods do not fully explore the potential of generative models for enhancing information exploration in 3D environments. In response, we propose FlowRAM, a novel framework that leverages generative models to achieve region-aware perception, enabling efficient multimodal information processing. Specifically, we devise a Dynamic Radius Schedule that allows adaptive perception, facilitating the transition from global scene comprehension to fine-grained geometric details. Furthermore, we use state space models to integrate multimodal information while preserving linear computational complexity. In addition, we employ conditional flow matching to learn action poses by regressing deterministic vector fields, simplifying the learning process while maintaining performance. We verify the effectiveness of FlowRAM on RLBench, an established manipulation benchmark, and achieve state-of-the-art performance. The results demonstrate that FlowRAM achieves a remarkable improvement, particularly in high-precision tasks, where it outperforms previous methods by 12.0% in average success rate. Additionally, FlowRAM is able to generate physically plausible actions for a variety of real-world tasks in fewer than 4 time steps, significantly increasing inference speed.
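
To make the conditional flow matching idea in the abstract concrete, here is a minimal sketch in plain Python. It is not FlowRAM's implementation. The straight-line path between noise and target, the function names, the 3-D target action, and the oracle velocity field (standing in for a trained network) are all assumptions chosen for illustration. It shows the regression target for a velocity field and a 4-step Euler integration from noise to an action.

```python
import random


def cfm_training_pair(x0, x1, t):
    """Return (x_t, target_velocity) on the straight path from x0 to x1.

    Assumes the linear interpolation x_t = (1 - t) * x0 + t * x1, whose
    velocity along the path is the constant x1 - x0.
    """
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    target_velocity = [b - a for a, b in zip(x0, x1)]
    return x_t, target_velocity


def euler_sample(velocity_fn, x0, n_steps=4):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt  # the last evaluation is at t = 1 - dt, never at t = 1
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Hypothetical 3-D target "action"; a real policy would predict a full pose.
target_action = [0.3, -0.2, 0.5]


def oracle_velocity(x, t):
    # Exact conditional field for the straight path: (x1 - x) / (1 - t).
    # A trained network conditioned on observations would replace this.
    return [(b - xi) / (1.0 - t) for xi, b in zip(x, target_action)]


random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in target_action]

# One training example: the network would regress v_target at (x_t, t).
x_t, v_target = cfm_training_pair(noise, target_action, t=0.5)

# Few-step inference: four Euler steps from noise toward the target.
sample = euler_sample(oracle_velocity, noise, n_steps=4)
```

Under this straight-path assumption the oracle field is constant along each trajectory, so a few Euler steps already land on the target. This gives one intuition for why flow matching policies can sample in very few steps. A learned field is only approximately straight, so the result would be approximate in practice.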

Original language: English
Pages (from-to): 12176-12186
Number of pages: 11
Journal: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Publication status: Published - 2025
Event: 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2025 - Nashville, United States
Duration: 11 Jun 2025 to 15 Jun 2025
