TY - GEN
T1 - LightningSpike
T2 - 2023 China Automation Congress, CAC 2023
AU - Han, Zhonghui
AU - Li, Youcheng
AU - Xue, Jianru
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - The cross-view geo-localization problem is to select, given a ground-view image, the aerial-view image of the same geographic location. Previous research has focused on matching accuracy, which leads to large models that are hard to train and deploy. We propose LightningSpike, a hybrid neural network system that is lightweight and offers fast training and inference for the cross-view geo-localization problem. The system consists of two parts: cross-view image semantic pattern matching and brain-inspired metric learning. In the first part, a reconstruction and semantic segmentation network generates a 3D semantic scene from the monocular ground-view image, from which the top view of the reconstructed scene is obtained. The aerial-view image is also segmented to obtain its semantic image. We propose a module named 3 Dimensions Dynamic Similarity Matching (3DDSM) that matches (shifts, rotates, and crops) the semantic patterns of the reconstructed top-view and aerial-view images to align the information between them. In the second part, we propose a lightweight Spiking Vision Multilayer Perceptron (SVMLP) to further learn a metric from the matched semantic images. The proposed method is verified on the CV-KITTI dataset, a new dataset we built by random sampling from the KITTI dataset [1]. The experimental results show that our method is a promising solution to the real-time cross-view geometric localization problem. Our code and dataset are publicly available at the time of publication to facilitate further research.
KW - 3D semantic reconstruction
KW - Cross-view geo-localization
KW - Metric learning
KW - Spiking neural network
UR - https://www.scopus.com/pages/publications/85189308228
U2 - 10.1109/CAC59555.2023.10451619
DO - 10.1109/CAC59555.2023.10451619
M3 - Conference contribution
AN - SCOPUS:85189308228
T3 - Proceedings - 2023 China Automation Congress, CAC 2023
SP - 8237
EP - 8242
BT - Proceedings - 2023 China Automation Congress, CAC 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 17 November 2023 through 19 November 2023
ER -