A dual-stream spatial-temporal detector for action recognition

  • Pingping Wei
  • Jiale Li
  • Peiran Liu
  • Li Li
  • Yifei Xu
  • Ling Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

To address the problem that existing human action recognition models cannot make full use of complementary information from different modalities, this work proposes a multi-path attention module, MA, which forms the MA-GCN model, and a dual-stream human action recognition model, SRHAR, that fuses skeleton data and RGB data. SRHAR uses LAF, also proposed in this work, to fuse skeleton features with RGB features. Introducing the skeleton modality gives the RGB modality access to complementary information, resulting in more accurate predictions. The algorithm prioritizes recognition accuracy over speed: it is slower at recognition, but achieves leading accuracy on public datasets.

Original language: English
Title of host publication: Fourth International Conference on Computer Vision and Pattern Analysis, ICCPA 2024
Editors: Ji Zhao, Yonghui Yang
Publisher: SPIE
ISBN (Electronic): 9781510682528
State: Published - 2024
Event: 4th International Conference on Computer Vision and Pattern Analysis, ICCPA 2024 - Anshan, China
Duration: 17 May 2024 to 19 May 2024

Publication series

Name: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 13256
ISSN (Print): 0277-786X
ISSN (Electronic): 1996-756X

Conference

Conference: 4th International Conference on Computer Vision and Pattern Analysis, ICCPA 2024
Country/Territory: China
City: Anshan
Period: 17/05/24 to 19/05/24

Keywords

  • action recognition
  • attention mechanism
  • lightweight
