BAP-DETR: Efficient drone object detection network based on bipartite attentive processing and dual fusion encoder

  • Xi'an Jiaotong University

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Object detection in drone aerial imagery faces critical challenges including extreme scale variance, clustered small objects, and complex backgrounds, leading to notable performance gaps in general detectors. The most effective solution is to increase the input resolution, but this substantially increases computational load. Existing methods are unable to achieve a satisfactory balance between accuracy and speed due to architectural inadequacies in preserving fine-grained features essential for small objects. Thus, we present an optimized model architecture based on the RT-DETR framework. By proposing the Bipartite Attentive Processing Block, which employs a channel-splitting strategy that allows parallel convolution and attention refinement, we improve the model’s ability to extract discriminative features from complex aerial images. A novel dual-fusion encoder with a Frequency-Aware Fusion Module further improves the model’s performance by retaining critical low-level features while effectively merging them with high-level semantic information. Additionally, we optimize the loss function by combining the Reciprocal Normalized Wasserstein Distance with CIoU. Extensive experiments on the VisDrone, UAVDT and AI-TOD datasets demonstrate the efficiency and effectiveness of our method. In particular, our method achieves a 6.9% higher AP than the baseline, requires 17.5% less computational load and provides superior accuracy compared to state-of-the-art methods.
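The abstract names the loss ingredients but not their formulas. As a rough illustration only, the sketch below computes the standard Normalized Wasserstein Distance between two boxes modeled as 2D Gaussians and blends it with an externally supplied CIoU loss. The "reciprocal" variant used in the paper is not specified here and is not reproduced; the constant `c`, the weight `alpha`, and the function names `nwd` and `combined_box_loss` are assumptions made for this example.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Standard NWD between two boxes given as (cx, cy, w, h).

    Each box is treated as a 2D Gaussian with mean (cx, cy) and standard
    deviations (w/2, h/2). `c` is a dataset-dependent scale constant;
    the default is only an illustrative value (assumption).
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # 2-Wasserstein distance between the two axis-aligned Gaussians.
    w2 = math.sqrt(
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    # Exponential normalization maps the distance into (0, 1].
    return math.exp(-w2 / c)


def combined_box_loss(nwd_value, ciou_loss, alpha=0.5):
    """Hypothetical blend of an NWD-based term with a CIoU loss value.

    `alpha` is an assumed weighting, not a value taken from the paper.
    """
    return alpha * (1.0 - nwd_value) + (1.0 - alpha) * ciou_loss
```

Unlike IoU, which drops to zero as soon as two tiny boxes stop overlapping, the NWD term decays smoothly with center offset and size difference, which is the usual motivation for pairing it with an IoU-style loss on small objects.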

Original language: English
Article number: 104565
Journal: Computer Vision and Image Understanding
Volume: 262
State: Published - Dec 2025

Keywords

  • Channel-splitting
  • Clustered small objects
  • Drone object detection
  • Feature fusion