TY - GEN
T1 - Atten-ganCV
T2 - 2023 China Automation Congress, CAC 2023
AU - Song, Wenjie
AU - Xue, Jianru
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Cross-view geo-localization matches a query ground-view image against the aerial-view images in a reference dataset, one by one, to determine the ground image's geographic location. The task is extremely challenging because the change in viewing angle between the two views produces large differences in geometry and appearance between image pairs. Introducing generative networks into matching models has recently been shown to work well on the CVUSA (Cross-View USA) dataset, and the latest models establish the paradigm of end-to-end generative cross-view image matching. However, this result relies on an assumption about the dataset: for every query ground image, there must exist a reference aerial image that is exactly centered on the location of that image. This assumption is clearly inconsistent with real-world application scenarios, and the performance of state-of-the-art generative models degrades significantly when it does not hold. To address this problem, this paper proposes a generative model (atten-ganCV) for non-center-aligned datasets. The model feeds the query ground image directly into a generative adversarial network to obtain a generated aerial-view image; its generator, atten-UNet, introduces an attention mechanism. The model then matches the synthesized image against the real aerial images in the reference dataset one by one and returns the match with the highest similarity, thereby determining the geographic location of the query. The model is tested on both the center-aligned CVUSA dataset and the non-center-aligned VIGOR (Cross-view Image Geo-localization beyond One-to-one Retrieval) dataset. On the VIGOR dataset, it achieves approximately the same accuracy as the state-of-the-art model at 3 times the inference speed.
KW - Cross-view geo-localization
KW - attention aggregation mechanism
KW - generative adversarial network
UR - https://www.scopus.com/pages/publications/85189324943
U2 - 10.1109/CAC59555.2023.10452075
DO - 10.1109/CAC59555.2023.10452075
M3 - Conference contribution
AN - SCOPUS:85189324943
T3 - Proceedings - 2023 China Automation Congress, CAC 2023
SP - 8920
EP - 8925
BT - Proceedings - 2023 China Automation Congress, CAC 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 17 November 2023 through 19 November 2023
ER -