
Towards Better Distortion Feature Learning for Object Detection in Top-View Fisheye Cameras

  • Xi'an Jiaotong University
  • Ltd.

Research output: Contribution to journal › Article › peer-review

1 Scopus citations

Abstract

With the development of deep learning in recent years, object detection with conventional cameras has improved significantly. Detecting objects in fisheye images, however, remains a significant challenge because of the distortion that fisheye cameras introduce. The dominant approaches focus on modifying the shape of the bounding box to better align with the boundaries of the distorted object. However, these methods neglect the learning of spatial distortion information, which prevents them from achieving satisfactory results. In this paper, we propose a novel fisheye camera detection network, dubbed SDANet, that learns distortion features better. SDANet is composed of a series of SDABlocks, which are designed to learn spatial distortion features. Each SDABlock consists of multiple convolution kernels of different sizes and can generate the most suitable kernel based on the distortion characteristics of the current input. Moreover, because the scarcity and uneven spatial distribution of fisheye image datasets limit performance, we propose a dedicated data augmentation strategy called Prominent Fisheye Distortion Augmentation (PFDAug). PFDAug introduces further distortions to fisheye images, effectively alleviating these problems. Experimental results on the CEPDOF, MW-R, HABBOF, LOAF, and FishEye8k fisheye image datasets demonstrate that our method achieves state-of-the-art performance.
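The abstract does not say how an SDABlock combines its kernels or how PFDAug distorts an image, so the following is only an illustrative sketch of the two ideas and not the authors' implementation. It assumes that kernel selection can be approximated by a softmax-weighted blend of zero-padded kernels, and that additional distortion can be approximated by a simple radial (barrel) remapping with nearest-neighbor sampling. The names `mix_kernels`, `radial_distort`, the `scores` input, and the strength parameter `k` are all hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pad_kernel(kernel, size):
    """Zero-pad a square, odd-sized kernel to size x size, keeping it centered."""
    k = len(kernel)
    off = (size - k) // 2
    out = [[0.0] * size for _ in range(size)]
    for i in range(k):
        for j in range(k):
            out[off + i][off + j] = kernel[i][j]
    return out


def mix_kernels(kernels, scores):
    """Blend kernels of different sizes into a single kernel.

    Assumption: `scores` stands in for whatever input-dependent signal the
    network would compute; the abstract does not specify it.
    """
    weights = softmax(scores)
    size = max(len(k) for k in kernels)
    mixed = [[0.0] * size for _ in range(size)]
    for kernel, w in zip(kernels, weights):
        padded = pad_kernel(kernel, size)
        for i in range(size):
            for j in range(size):
                mixed[i][j] += w * padded[i][j]
    return mixed


def radial_distort(image, k):
    """Apply extra barrel-style distortion to a 2D list of pixel values.

    Assumption: each output pixel at normalized radius r samples the source
    at radius r * (1 + k * r^2). Pixels whose source falls outside the image
    are left at 0. k = 0 returns an unchanged copy.
    """
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    r_max = math.hypot(cy, cx) or 1.0  # avoid division by zero for 1x1 input
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = (y - cy) / r_max, (x - cx) / r_max
            scale = 1 + k * (dx * dx + dy * dy)
            sy = round(cy + dy * scale * r_max)
            sx = round(cx + dx * scale * r_max)
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = image[sy][sx]
    return out
```

With equal scores, `mix_kernels` averages the padded kernels, so a 1×1 and a 3×3 kernel yield one 3×3 kernel; unequal scores shift the blend toward one size. In `radial_distort`, a positive `k` pulls peripheral content toward the center, which is the kind of extra distortion an augmentation step might add. Bounding-box labels would need the same remapping, which this sketch omits.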

Original language: English
Journal: IEEE Transactions on Multimedia
State: Accepted/In press - 2024

Keywords

  • Data Augmentation
  • Distortion
  • Fisheye Images
  • Object Detection
