TY - GEN
T1 - Boosting Few-Shot Remote Sensing Image Scene Classification with Language-Guided Multimodal Prompt Tuning
AU - Bi, Haixia
AU - Gao, Zhangwei
AU - Liu, Kang
AU - Song, Qian
AU - Wang, Xiaotian
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Remote sensing image scene classification is an important research topic in the remote sensing community and has attracted growing interest with the recent development of deep learning techniques. However, the need for large amounts of annotated data poses a major challenge for deep learning-based scene classification approaches. Visual-linguistic pretraining models, which improve the transferability of visual models by using text as supervision, offer a new way to approach the task when labels are scarce. In this paper, we explore prompt engineering, aiming to achieve satisfactory performance of multi-modal pretraining models on the downstream remote sensing image scene classification task with minimal training data. Experiments were conducted on multiple publicly available datasets. The results indicate that training the learnable prompts with a small number of samples yields strong results, surpassing the few-shot transfer learning results of the best-performing pre-trained models.
AB - Remote sensing image scene classification is an important research topic in the remote sensing community and has attracted growing interest with the recent development of deep learning techniques. However, the need for large amounts of annotated data poses a major challenge for deep learning-based scene classification approaches. Visual-linguistic pretraining models, which improve the transferability of visual models by using text as supervision, offer a new way to approach the task when labels are scarce. In this paper, we explore prompt engineering, aiming to achieve satisfactory performance of multi-modal pretraining models on the downstream remote sensing image scene classification task with minimal training data. Experiments were conducted on multiple publicly available datasets. The results indicate that training the learnable prompts with a small number of samples yields strong results, surpassing the few-shot transfer learning results of the best-performing pre-trained models.
KW - Few-shot learning
KW - Multi-modal pretraining
KW - Prompt tuning
KW - Remote sensing image scene classification
UR - https://www.scopus.com/pages/publications/85185221040
U2 - 10.1109/NTCI60157.2023.10403750
DO - 10.1109/NTCI60157.2023.10403750
M3 - Conference contribution
AN - SCOPUS:85185221040
T3 - Proceedings of 2023 International Conference on New Trends in Computational Intelligence, NTCI 2023
SP - 293
EP - 297
BT - Proceedings of 2023 International Conference on New Trends in Computational Intelligence, NTCI 2023
A2 - Wang, Jian
A2 - Polycarpou, Marios M.
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 International Conference on New Trends in Computational Intelligence, NTCI 2023
Y2 - 3 November 2023 through 5 November 2023
ER -