TY - GEN
T1 - FuPaD
T2 - 18th International Conference on Intelligent Robotics and Applications, ICIRA 2025
AU - Qi, Dexin
AU - Tao, Tao
AU - Zhang, Zhihong
AU - Mei, Xuesong
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2026.
PY - 2026
Y1 - 2026
AB - Pose estimation, a cornerstone of 3D computer vision, is crucial for applications such as autonomous driving and augmented reality. Global feed-forward methods, such as VGGT, demonstrate potential in direct scene reconstruction and pose inference. However, they are often constrained by prohibitive memory requirements when processing the long sequences typical of large-scale environments. Furthermore, the accuracy of their single-pass predictions is often limited by the absence of explicit local geometric modeling or iterative refinement. To address these limitations, we introduce FuPaD, a novel hierarchical approach for scalable pose estimation. FuPaD integrates global pose priors derived from a tailored VGGT with the local refinement offered by dense bundle adjustment (DBA). First, a tracking-informed patch sampling strategy selects salient image patches from keyframes. These patches are then processed by the tailored VGGT to yield globally consistent keyframe pose priors, while significantly reducing the memory footprint compared to frame-wise processing. These global keyframe poses are then integrated with dense local pose estimates from DBA within a pose graph optimization framework. Finally, a global DBA module further refines all poses. This hierarchical fusion ensures global consistency while benefiting from the fine-grained local refinement provided by DBA. Evaluation on benchmarks indicates that FuPaD achieves competitive pose accuracy, particularly in large-scale scenarios, while remaining computationally and memory efficient.
KW - 3D Reconstruction
KW - Deep Learning Methods
KW - Deep Learning for Visual Perception
KW - Visual SLAM
UR - https://www.scopus.com/pages/publications/105020819486
U2 - 10.1007/978-981-95-2101-2_42
DO - 10.1007/978-981-95-2101-2_42
M3 - Conference contribution
AN - SCOPUS:105020819486
SN - 9789819521005
T3 - Lecture Notes in Computer Science
SP - 508
EP - 520
BT - Intelligent Robotics and Applications - 18th International Conference, ICIRA 2025, Proceedings
A2 - Matsuno, Takayuki
A2 - Liu, Honghai
A2 - Liu, Lianqing
A2 - Yin, Zhouping
A2 - Zhu, Xiangyang
A2 - Ren, Weihong
A2 - Wang, Zhiyong
A2 - Sheng, Yixuan
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 6 August 2025 through 9 August 2025
ER -