Atten-ganCV: An End-to-End Close-Coupled Image-Generating Cross-View Network

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Cross-view geo-localization aims to match a query ground-view image against the aerial-view images in a reference dataset, one by one, to determine the ground image's geographic location. The task is highly challenging because the change in observation angle between cross-view images produces large geometric and appearance differences between image pairs. Introducing generative networks into matching models has been shown to work well on the CVUSA (Cross-View USA) dataset, and recent models establish the paradigm of end-to-end generative cross-view image matching. However, these results rely on an assumption about the dataset: for every query ground image, there must exist a reference aerial image exactly centered on that image's location, which is clearly inconsistent with real-world application scenarios, and the performance of state-of-the-art generative models degrades significantly once this center-alignment assumption is dropped. To address this problem, this paper proposes a generative model, atten-ganCV, for non-center-aligned datasets. The model feeds the query ground image directly into a generative adversarial network to synthesize an aerial-view image, where the generator, atten-UNet, introduces an attention mechanism. The model then matches the synthesized image against the real aerial images in the reference dataset one by one and returns the match with the highest similarity, thereby determining the geographic location of the query. The model is evaluated on both the center-aligned CVUSA dataset and the non-center-aligned VIGOR (Cross-view Image Geo-localization beyond One-to-one Retrieval) dataset. On VIGOR, it achieves approximately the same accuracy as the state-of-the-art model at 3 times the inference speed.
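The retrieval stage the abstract describes — comparing the GAN-synthesized aerial image against every reference aerial image and keeping the most similar one — can be sketched as a cosine-similarity ranking over feature embeddings. This is a minimal illustration under assumed conventions (the `retrieve` function, the embedding vectors, and their dimensionality are hypothetical, not the paper's implementation):

```python
import numpy as np

def retrieve(query_embedding, reference_embeddings):
    """Rank reference aerial images by cosine similarity to the embedding
    of the synthesized aerial view. Returns indices, best match first."""
    q = query_embedding / np.linalg.norm(query_embedding)
    refs = reference_embeddings / np.linalg.norm(
        reference_embeddings, axis=1, keepdims=True)
    sims = refs @ q              # cosine similarity to each reference
    return np.argsort(-sims)     # descending similarity order

# Toy example: 3 reference embeddings; index 1 matches the query exactly.
refs = np.array([[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]])
query = np.array([0.6, 0.8])
ranking = retrieve(query, refs)  # ranking[0] is the predicted location
```

The top-ranked reference's known geographic coordinates then serve as the predicted location of the query ground image.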

Original language: English
Title of host publication: Proceedings - 2023 China Automation Congress, CAC 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 8920-8925
Number of pages: 6
ISBN (Electronic): 9798350303759
DOIs
State: Published - 2023
Event: 2023 China Automation Congress, CAC 2023 - Chongqing, China
Duration: 17 Nov 2023 - 19 Nov 2023

Publication series

Name: Proceedings - 2023 China Automation Congress, CAC 2023

Conference

Conference: 2023 China Automation Congress, CAC 2023
Country/Territory: China
City: Chongqing
Period: 17/11/23 - 19/11/23

Keywords

  • Cross-view geo-localization
  • attention aggregation mechanism
  • generative adversarial network
