
Flexible scene text recognition based on dual attention mechanism

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Scene text recognition (STR), which extracts text from complex natural scenes, is a popular topic in computer vision. In this article, we propose an end-to-end trainable and flexible STR method based on a dual attention mechanism. The proposed method consists of four modules: a thin plate spline transformer that normalizes the original image, a Channel-Att feature extractor that obtains representative features, a bidirectional long short-term memory encoder that encodes sequential context features, and a Self-Att based decoder that predicts text labels. Results on seven benchmark datasets (IIIT, SVT, IC03, IC13, IC15, SVTP, and CUTE) show that the proposed method is comparable to 13 existing methods. In particular, its average text recognition accuracy is about 1.4% higher than that of the state-of-the-art method.
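
The two attention mechanisms named in the abstract can be illustrated with a minimal, dependency-free sketch. The abstract does not specify how Channel-Att or Self-Att are implemented, so everything below is an assumption: the channel gate follows a common squeeze-and-excitation pattern with a single hypothetical weight matrix, and the self-attention uses plain scaled dot-product attention with identity projections (Q = K = V). The function names are illustrative and are not taken from the article.

```python
import math


def channel_attention(feature_map, weights):
    """Squeeze-and-excitation style gate (an assumed stand-in for Channel-Att).

    feature_map: list of C channels, each an H x W list of floats.
    weights: C x C matrix of hypothetical learned parameters.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # Excite: one linear layer followed by a sigmoid gate in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * s for w, s in zip(w_row, squeezed))))
        for w_row in weights
    ]
    # Scale: reweight every value in a channel by that channel's gate.
    return [
        [[v * g for v in row] for row in ch]
        for ch, g in zip(feature_map, gates)
    ]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Scaled dot-product self-attention with identity projections.

    seq: list of T vectors of dimension d. A trained decoder would apply
    learned query, key, and value projections first; they are omitted here.
    """
    d = len(seq[0])
    out = []
    for q in seq:
        # Similarity of this position to every position, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        probs = softmax(scores)
        # Output is the attention-weighted average of all positions.
        out.append([sum(p * v[j] for p, v in zip(probs, seq)) for j in range(d)])
    return out
```

In this sketch the channel gate rescales whole feature channels, while self-attention mixes information across sequence positions; in a full model these would sit in the feature extractor and the decoder, respectively, with learned parameters in place of the placeholders.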

Original language: English
Article number: e5863
Journal: Concurrency and Computation: Practice and Experience
Volume: 33
Issue number: 22
State: Published - 25 Nov 2021

Keywords

  • channel attention
  • scene text recognition
  • self-attention
