TY - GEN
T1 - SS3D
T2 - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
AU - Liu, Chuandong
AU - Gao, Chenqiang
AU - Liu, Fangcen
AU - Liu, Jiang
AU - Meng, Deyu
AU - Gao, Xinbo
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Conventional deep-learning-based methods for 3D object detection require a large number of 3D bounding box annotations for training, which are expensive to obtain in practice. Sparsely annotated object detection, which can greatly reduce the annotation workload, is very challenging because the missing-annotated instances are treated as background during training. In this paper, we propose a sparsely-supervised 3D object detection method, named SS3D. To eliminate the negative supervision caused by the missing annotations, we design a missing-annotated instance mining module with strict filtering strategies to mine positive instances. We also design a reliable background mining module and a point cloud filling data augmentation strategy to generate confident data for iterative learning with reliable supervision. The proposed SS3D is a general framework that can be used to train any modern 3D object detector. Extensive experiments on the KITTI dataset show that, across different 3D detectors, the proposed SS3D framework achieves performance on par with fully-supervised methods while requiring only 20% of the annotations. Compared with the state-of-the-art semi-supervised 3D object detection method on KITTI, our SS3D improves the benchmarks by significant margins under the same annotation workload. Moreover, our SS3D also outperforms the state-of-the-art weakly-supervised method by remarkable margins, highlighting its effectiveness.
KW - 3D from multi-view and sensors
KW - Recognition: detection, categorization, retrieval
KW - Robot vision
UR - https://www.scopus.com/pages/publications/85143549359
U2 - 10.1109/CVPR52688.2022.00824
DO - 10.1109/CVPR52688.2022.00824
M3 - Conference contribution
AN - SCOPUS:85143549359
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 8418
EP - 8427
BT - Proceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
PB - IEEE Computer Society
Y2 - 19 June 2022 through 24 June 2022
ER -