TY - GEN
T1 - GaussBEV
T2 - 21st IFIP WG 12.5 International Conference on Artificial Intelligence Applications and Innovations, AIAI 2025
AU - Zhang, Junjie
AU - Zhou, Sanping
AU - Huang, Yuhao
AU - Dong, Jinpeng
AU - Fu, Jingwen
AU - Zheng, Nanning
N1 - Publisher Copyright: © IFIP International Federation for Information Processing 2025.
PY - 2025
Y1 - 2025
N2 - In recent years, camera-only 3D object detectors have made significant progress, largely fueled by the adoption of the Bird’s-Eye-View (BEV) representation. However, a notable limitation remains: BEV representations weaken height information, especially in processes such as voxel-pooling, which flattens 3D voxel features directly onto a 2D plane to build the BEV feature. To address this issue, we propose GaussBEV, a method that departs from the conventional construction of the BEV feature. It first introduces slice-voxel-pooling to preserve height information and categorizes objects into groups based on statistics so that each group can be processed differently. Exploiting the distinct spatial distribution within each group, we design a Gaussian Weight Generator (GWG) module, which reweights voxel features using learnable Gaussian parameters to generate group features that largely retain the corresponding group-wise height information. Subsequently, an Efficient Channel Attention (ECA) FPN is introduced to provide a global feature, which is combined with the group features to capture both group spatial information and global semantics. This combination yields a comprehensive and detailed representation of the 3D environment. With the combined features, we use multiple detection heads, one per group, where each head focuses on the feature-construction procedure of its corresponding group. Extensive experiments and thorough analysis on the nuScenes dataset validate the effectiveness of GaussBEV.
AB - In recent years, camera-only 3D object detectors have made significant progress, largely fueled by the adoption of the Bird’s-Eye-View (BEV) representation. However, a notable limitation remains: BEV representations weaken height information, especially in processes such as voxel-pooling, which flattens 3D voxel features directly onto a 2D plane to build the BEV feature. To address this issue, we propose GaussBEV, a method that departs from the conventional construction of the BEV feature. It first introduces slice-voxel-pooling to preserve height information and categorizes objects into groups based on statistics so that each group can be processed differently. Exploiting the distinct spatial distribution within each group, we design a Gaussian Weight Generator (GWG) module, which reweights voxel features using learnable Gaussian parameters to generate group features that largely retain the corresponding group-wise height information. Subsequently, an Efficient Channel Attention (ECA) FPN is introduced to provide a global feature, which is combined with the group features to capture both group spatial information and global semantics. This combination yields a comprehensive and detailed representation of the 3D environment. With the combined features, we use multiple detection heads, one per group, where each head focuses on the feature-construction procedure of its corresponding group. Extensive experiments and thorough analysis on the nuScenes dataset validate the effectiveness of GaussBEV.
KW - 3D object detection
KW - Autonomous Driving
KW - Height-aware feature enhancement
UR - https://www.scopus.com/pages/publications/105009902616
U2 - 10.1007/978-3-031-96239-4_19
DO - 10.1007/978-3-031-96239-4_19
M3 - Conference contribution
AN - SCOPUS:105009902616
SN - 9783031962387
T3 - IFIP Advances in Information and Communication Technology
SP - 258
EP - 271
BT - Artificial Intelligence Applications and Innovations - 21st IFIP WG 12.5 International Conference, AIAI 2025, Proceedings
A2 - Maglogiannis, Ilias
A2 - Iliadis, Lazaros
A2 - Papaleonidas, Antonios
A2 - Andreou, Andreas
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 26 June 2025 through 29 June 2025
ER -