Abstract
Structural information is an essential component of efficient object detection. In many visual detection tasks, objects with large structural deformation make up a large proportion of the targets. The shape, contour, and internal structure of these objects can change dramatically, which makes efficient detection difficult. Detecting such objects robustly and accurately is therefore a significant challenge. To address this issue, we first introduce a Cross Stage Partial connections-based weighted Bi-directional Feature Pyramid Network (CSP-BiFPN), which allows easy and efficient multi-scale feature fusion through cross-stage partial connections. Second, to enhance the model's spatial transformation capacity, the multi-scale feature maps extracted from the YOLO backbone network are processed by an enhanced spatial transformation network (ESTN) to handle spatial deformations. Based on these architectural modifications and optimizations, we develop a novel real-time robust object detection model called Bi-STN-YOLO. We evaluate the proposed method on four image datasets. The experimental results demonstrate that the proposed approach achieves significant improvements over typical YOLO-family models and performance competitive with state-of-the-art methods in detection tasks.
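The abstract does not spell out how the weighted fusion in CSP-BiFPN is computed. As a rough illustration only, the sketch below shows the "fast normalized fusion" rule used by the original BiFPN design, in which each input feature map gets a learnable non-negative scalar weight and the weights are normalized to sum to roughly one. The function name `fast_normalized_fusion`, the toy 2x2 maps, and the weight values are assumptions made for this example; whether Bi-STN-YOLO uses exactly this rule is not stated here.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped 2D feature maps with non-negative, normalized weights.

    features: list of 2D lists, all with the same height and width.
    weights:  one scalar per feature map (learned in a real network;
              hypothetical fixed values in this sketch).
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature map")
    # ReLU keeps every weight non-negative so the normalization stays stable.
    relu = [max(0.0, w) for w in weights]
    total = sum(relu) + eps
    norm = [w / total for w in relu]
    height, width = len(features[0]), len(features[0][0])
    # Weighted sum of the inputs at every spatial position.
    return [
        [sum(n * f[r][c] for n, f in zip(norm, features)) for c in range(width)]
        for r in range(height)
    ]


# Two toy 2x2 "feature maps", assumed already resized to a common resolution.
p_top_down = [[1.0, 2.0], [3.0, 4.0]]
p_lateral = [[5.0, 6.0], [7.0, 8.0]]
fused = fast_normalized_fusion([p_top_down, p_lateral], [1.0, 3.0])
```

With weights 1 and 3, the second map contributes about three quarters of each fused value. In a real detector the same rule would be applied per channel on tensors, followed by a convolution, and the weights would be trained with the rest of the network.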
| Original language | English |
|---|---|
| Pages (from-to) | 70-82 |
| Number of pages | 13 |
| Journal | Neurocomputing |
| Volume | 513 |
| State | Published - 7 Nov 2022 |
| Externally published | Yes |
Keywords
- Image detection
- Robust object detection
- Spatial transformation
- Structural deformation
Title
Cross stage partial connections based weighted Bi-directional feature pyramid and enhanced spatial transformation network for robust object detection