TY - GEN
T1 - Learning Segmented 3D Gaussians via Efficient Feature Unprojection for Zero-Shot Neural Scene Segmentation
AU - Dou, Bin
AU - Zhang, Tianyu
AU - Wang, Zhaohui
AU - Ma, Yongjia
AU - Yuan, Zejian
AU - Zheng, Nanning
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
PY - 2025
Y1 - 2025
N2 - Zero-shot neural scene segmentation, which reconstructs a 3D neural segmentation field without manual annotations, serves as an effective approach to scene understanding. However, existing models, especially the efficient 3D Gaussian-based methods, struggle to produce compact segmentation results. This issue stems primarily from the redundant learnable attributes assigned to individual Gaussians, which leads to a lack of robustness against 3D inconsistencies in zero-shot-generated raw labels. To address this problem, our work, named Compact Segmented 3D Gaussians (CoSegGaussians), proposes the Feature Unprojection and Fusion module as the segmentation field, which utilizes a shallow decoder generalizable to all Gaussians based on high-level features. Specifically, leveraging the learned Gaussian geometric parameters, semantic-aware image-based features are introduced into the scene via our unprojection technique. The lifted features, together with spatial information, are fed into a multi-scale aggregation decoder to generate segmentation identities for all Gaussians. Furthermore, we design the CoSeg Loss to boost model robustness against 3D-inconsistent noise. Experimental results show that our model surpasses baselines on the zero-shot semantic segmentation task, improving by ∼10% mIoU over the best baseline. Code and more results will be available at https://David-Dou.github.io/CoSegGaussians.
KW - 3D Gaussian Splatting
KW - Neural Scene Segmentation
KW - Novel View Segmentation
KW - Zero-shot Segmentation
UR - https://www.scopus.com/pages/publications/105008638267
U2 - 10.1007/978-981-96-6596-9_28
DO - 10.1007/978-981-96-6596-9_28
M3 - Conference contribution
AN - SCOPUS:105008638267
SN - 9789819665983
T3 - Lecture Notes in Computer Science
SP - 398
EP - 412
BT - Neural Information Processing - 31st International Conference, ICONIP 2024, Proceedings
A2 - Mahmud, Mufti
A2 - Doborjeh, Maryam
A2 - Wong, Kevin
A2 - Leung, Andrew Chi Sing
A2 - Doborjeh, Zohreh
A2 - Tanveer, M.
PB - Springer Science and Business Media Deutschland GmbH
T2 - 31st International Conference on Neural Information Processing, ICONIP 2024
Y2 - 2 December 2024 through 6 December 2024
ER -