
GaussBEV: Multi-head Gaussian Feature Grouping for Multi-view 3D Object Detection

  • Xi'an Jiaotong University

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

In recent years, camera-only 3D object detectors have made significant progress, largely fueled by the adoption of the Bird’s-Eye-View (BEV) representation. However, a notable limitation remains: BEV representations weaken height information, especially in operations such as voxel pooling, which flatten 3D voxel features directly onto a 2D plane to build the BEV feature. To address this issue, we propose a novel and effective method termed GaussBEV, which departs from the conventional construction of BEV features. It begins by introducing slice-voxel-pooling to preserve height information and by categorizing objects into different groups based on statistics for differential processing. Exploiting the distinct spatial distribution within each group, we design a Gaussian Weight Generator (GWG) module, which reweights voxel features based on learnable Gaussian parameters, thereby generating group features that retain the corresponding group-wise height information to a great extent. Subsequently, an Efficient Channel Attention (ECA) FPN is introduced to provide global features, which are combined with the group features to capture both group spatial information and global semantics. This combination strategy ensures a comprehensive and detailed representation of the 3D environment. With the combined features, we use multiple detection heads for specific groups, where each head focuses on the feature-constructing procedure of the corresponding group. Extensive experiments and thorough analysis on the nuScenes dataset validate the effectiveness of GaussBEV.
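The reweighting idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the per-cell `[Z][C]` feature layout, the normalization of the weights, and the fixed `(mu, sigma)` values are all assumptions made for illustration, whereas the abstract describes the Gaussian parameters as learnable. The sketch only shows how Gaussian weights over height slices can produce a different pooled feature for each object group, instead of flattening all slices equally.

```python
import math


def gaussian_height_weights(num_slices, mu, sigma):
    """Normalized Gaussian weights over height-slice indices 0..num_slices-1."""
    raw = [math.exp(-((z - mu) ** 2) / (2.0 * sigma ** 2)) for z in range(num_slices)]
    total = sum(raw)
    return [w / total for w in raw]


def group_feature(slice_features, mu, sigma):
    """Collapse per-slice features [Z][C] into one [C] vector using Gaussian weights."""
    weights = gaussian_height_weights(len(slice_features), mu, sigma)
    channels = len(slice_features[0])
    return [
        sum(w * f[c] for w, f in zip(weights, slice_features))
        for c in range(channels)
    ]


# Toy input (assumed layout): 4 height slices, 2 channels, for a single BEV cell.
slices = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [2.0, 2.0]]

# Hypothetical groups with illustrative (mu, sigma) values; in the paper these
# parameters are learned rather than fixed by hand.
groups = {"low": (0.5, 0.8), "high": (3.0, 0.8)}

# One pooled feature vector per group, each emphasizing a different height range.
features = {
    name: group_feature(slices, mu, sigma) for name, (mu, sigma) in groups.items()
}
```

Here the "high" group places most of its weight on the top slice, so its pooled feature is dominated by that slice, while the "low" group emphasizes the bottom two slices. A plain sum or mean over the slices would give every group the same feature.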

Original language: English
Title of host publication: Artificial Intelligence Applications and Innovations - 21st IFIP WG 12.5 International Conference, AIAI 2025, Proceedings
Editors: Ilias Maglogiannis, Lazaros Iliadis, Antonios Papaleonidas, Andreas Andreou
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 258-271
Number of pages: 14
ISBN (Print): 9783031962387
State: Published - 2025
Event: 21st IFIP WG 12.5 International Conference on Artificial Intelligence Applications and Innovations, AIAI 2025 - Limassol, Cyprus
Duration: 26 Jun 2025 to 29 Jun 2025

Publication series

Name: IFIP Advances in Information and Communication Technology
Volume: 755 IFIPAICT
ISSN (Print): 1868-4238
ISSN (Electronic): 1868-422X

Conference

Conference: 21st IFIP WG 12.5 International Conference on Artificial Intelligence Applications and Innovations, AIAI 2025
Country/Territory: Cyprus
City: Limassol
Period: 26/06/25 to 29/06/25

Keywords

  • 3D object detection
  • Autonomous Driving
  • Height-aware feature enhancement
