TY - GEN
T1 - Scene Attention Mechanism for Remote Sensing Image Caption Generation
AU - Wu, Shiqi
AU - Zhang, Xiangrong
AU - Wang, Xin
AU - Li, Chen
AU - Jiao, Licheng
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/7
Y1 - 2020/7
N2 - Remote sensing images play an important role in various applications. To make it easier for humans to understand remote sensing images, the task of remote sensing image captioning attracts more and more researchers' attention. Inspired from the way human receives visual information, attention mechanism has been widely used in remote sensing image understanding. To catch more scene information and improve the stability of the generated sentences, a new attention mechanism called scene attention is proposed. Except for the current attention via the current hidden state of the long shortterm memory network (LSTM), our proposed method simultaneously explores the global visual information from the mean feature of all convolutional features. The effectiveness of the proposed method is evaluated on UCM-captions, Sydney-captions and RSICD datasets. The results of our experiment show that comparing with some other captioning methods, our method is more stable and obtains a better performance.
AB - Remote sensing images play an important role in various applications. To make it easier for humans to understand remote sensing images, the task of remote sensing image captioning attracts more and more researchers' attention. Inspired from the way human receives visual information, attention mechanism has been widely used in remote sensing image understanding. To catch more scene information and improve the stability of the generated sentences, a new attention mechanism called scene attention is proposed. Except for the current attention via the current hidden state of the long shortterm memory network (LSTM), our proposed method simultaneously explores the global visual information from the mean feature of all convolutional features. The effectiveness of the proposed method is evaluated on UCM-captions, Sydney-captions and RSICD datasets. The results of our experiment show that comparing with some other captioning methods, our method is more stable and obtains a better performance.
KW - convolutional neural network
KW - long short-term memory network
KW - remote sensing image captioning
KW - scene attention mechanism
UR - https://www.scopus.com/pages/publications/85093828584
U2 - 10.1109/IJCNN48605.2020.9207381
DO - 10.1109/IJCNN48605.2020.9207381
M3 - 会议稿件
AN - SCOPUS:85093828584
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2020 International Joint Conference on Neural Networks, IJCNN 2020 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2020 International Joint Conference on Neural Networks, IJCNN 2020
Y2 - 19 July 2020 through 24 July 2020
ER -