TY - JOUR
T1 - A novel adaptive weighted fusion network based on pixel level feature importance for two-stage 6D pose estimation
AU - Xiao, Haitao
AU - Ma, Linkun
AU - Li, Qinyao
AU - Ma, Shuo
AU - Guo, Hongxuan
AU - Wang, Wenjie
AU - Ogai, Harutoshi
N1 - Publisher Copyright: © 2025 Elsevier B.V.
PY - 2025/8/14
Y1 - 2025/8/14
N2 - In intelligent industry, accurate recognition and localization of objects in an image are the basis for robots to perform autonomous and intelligent operations. With the rapid development and application of deep learning data fusion technology in pose estimation, existing 6D pose estimation methods have made considerable progress. However, most existing methods are not accurate enough to cope with scenes with cluttered backgrounds, inconspicuous textures, and occluded objects. In addition, existing methods ignore the effect of instance segmentation accuracy on pose estimation accuracy. To address the above issues, this paper proposes a two-stage 6D pose estimation method based on an adaptive pixel-importance weighted fusion network with lightweight instance segmentation, named TAPWFusion. In the instance segmentation stage, a lightweight instance segmentation network based on multiscale attention and boundary constraints, named CVi-BC-YOLO, is proposed to improve segmentation accuracy and efficiency. In the pose estimation stage, to reduce the interference of lighting and occlusion and to improve the accuracy of pose estimation, we propose an adaptive pixel-importance weighted fusion network, named APWFusion, which adaptively evaluates the importance of the RGB color information and the geometric information of the point cloud. Experiments on the LineMOD, YCB-Video, and T-LESS datasets demonstrate the effectiveness of the proposed method.
KW - Adaptive weighted fusion
KW - Instance segmentation
KW - Object pose estimation
KW - Pixel-importance
UR - https://www.scopus.com/pages/publications/105005070297
U2 - 10.1016/j.neucom.2025.130371
DO - 10.1016/j.neucom.2025.130371
M3 - Article
AN - SCOPUS:105005070297
SN - 0925-2312
VL - 642
JO - Neurocomputing
JF - Neurocomputing
M1 - 130371
ER -