Scene Attention Mechanism for Remote Sensing Image Caption Generation

  • Shiqi Wu
  • Xiangrong Zhang
  • Xin Wang
  • Chen Li
  • Licheng Jiao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

32 Scopus citations

Abstract

Remote sensing images play an important role in various applications. To make remote sensing images easier for humans to understand, the task of remote sensing image captioning is attracting growing attention from researchers. Inspired by the way humans receive visual information, the attention mechanism has been widely used in remote sensing image understanding. To capture more scene information and improve the stability of the generated sentences, a new attention mechanism called scene attention is proposed. In addition to the current attention computed from the current hidden state of the long short-term memory network (LSTM), the proposed method simultaneously exploits global visual information from the mean feature of all convolutional features. The effectiveness of the proposed method is evaluated on the UCM-captions, Sydney-captions and RSICD datasets. The experimental results show that, compared with some other captioning methods, our method is more stable and achieves better performance.
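
The idea of combining a hidden-state term with a global mean-feature term can be illustrated with a minimal sketch. The abstract does not give the scoring function, so everything below is an assumption made for illustration: the function names (`mean_feature`, `scene_attention`), the use of plain dot products, the additive combination of the two scores, and the requirement that the hidden state has the same dimension as the region features. It is not the authors' formulation.

```python
import math


def mean_feature(features):
    """Average all region feature vectors into one global vector."""
    n = len(features)
    dim = len(features[0])
    return [sum(f[d] for f in features) / n for d in range(dim)]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scene_attention(features, hidden):
    """Hypothetical scoring: a hidden-state term plus a global-scene term.

    Assumes `hidden` has the same dimension as each region feature.
    Returns the attention weights and the weighted context vector.
    """
    global_feat = mean_feature(features)
    scores = [dot(f, hidden) + dot(f, global_feat) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    context = [
        sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)
    ]
    return weights, context
```

In a captioning decoder, a function like this would be called once per time step with the current LSTM hidden state, and the resulting context vector would feed the next word prediction. A learned projection would normally replace the raw dot products used here.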

Original language: English
Title of host publication: 2020 International Joint Conference on Neural Networks, IJCNN 2020 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728169262
DOIs
State: Published - Jul 2020
Externally published: Yes
Event: 2020 International Joint Conference on Neural Networks, IJCNN 2020 - Virtual, Glasgow, United Kingdom
Duration: 19 Jul 2020 → 24 Jul 2020

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks

Conference

Conference: 2020 International Joint Conference on Neural Networks, IJCNN 2020
Country/Territory: United Kingdom
City: Virtual, Glasgow
Period: 19/07/20 → 24/07/20

Keywords

  • convolutional neural network
  • long short-term memory network
  • remote sensing image captioning
  • scene attention mechanism
