TY - JOUR
T1 - Towards Better Distortion Feature Learning for Object Detection in Top-View Fisheye Cameras
AU - Guo, Pengbo
AU - Liu, Chengxu
AU - Hou, Xingsong
AU - Qian, Xueming
N1 - Publisher Copyright: © 1999-2012 IEEE.
PY - 2024
Y1 - 2024
N2 - With the development of deep learning in recent years, the performance of object detection with conventional cameras has improved significantly. Nevertheless, due to the distortion introduced by fisheye cameras, detecting objects in fisheye images remains a significant challenge. The dominant approaches focus on modifying the shape of the bounding box to better align with the boundaries of the distorted object. However, these methods neglect the learning of spatial distortion information, which prevents them from achieving satisfactory results. In this paper, we propose a novel fisheye camera detection network, dubbed SDANet, that learns distortion features better. SDANet is composed of a series of SDABlocks, which are designed to learn spatial distortion features. Each SDABlock consists of multiple convolution kernels of different sizes and can generate the most suitable kernel based on the distortion characteristics of the current input. Moreover, to address the performance limitations caused by the scarcity and uneven spatial distribution of fisheye image datasets, we propose a dedicated data augmentation strategy called Prominent Fisheye Distortion Augmentation (PFDAug). PFDAug introduces further distortions to fisheye images, effectively alleviating these problems. Experimental results on the CEPDOF, MW-R, HABBOF, LOAF, and FishEye8k fisheye image datasets demonstrate that our method achieves state-of-the-art performance.
AB - With the development of deep learning in recent years, the performance of object detection with conventional cameras has improved significantly. Nevertheless, due to the distortion introduced by fisheye cameras, detecting objects in fisheye images remains a significant challenge. The dominant approaches focus on modifying the shape of the bounding box to better align with the boundaries of the distorted object. However, these methods neglect the learning of spatial distortion information, which prevents them from achieving satisfactory results. In this paper, we propose a novel fisheye camera detection network, dubbed SDANet, that learns distortion features better. SDANet is composed of a series of SDABlocks, which are designed to learn spatial distortion features. Each SDABlock consists of multiple convolution kernels of different sizes and can generate the most suitable kernel based on the distortion characteristics of the current input. Moreover, to address the performance limitations caused by the scarcity and uneven spatial distribution of fisheye image datasets, we propose a dedicated data augmentation strategy called Prominent Fisheye Distortion Augmentation (PFDAug). PFDAug introduces further distortions to fisheye images, effectively alleviating these problems. Experimental results on the CEPDOF, MW-R, HABBOF, LOAF, and FishEye8k fisheye image datasets demonstrate that our method achieves state-of-the-art performance.
KW - Data Augmentation
KW - Distortion
KW - Fisheye Images
KW - Object Detection
UR - https://www.scopus.com/pages/publications/85213888762
U2 - 10.1109/TMM.2024.3521808
DO - 10.1109/TMM.2024.3521808
M3 - Article
AN - SCOPUS:85213888762
SN - 1520-9210
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -