Action Recognition and Benchmark Using Event Cameras

  • Yue Gao
  • Jiaxuan Lu
  • Siqi Li
  • Nan Ma
  • Shaoyi Du
  • Yipeng Li
  • Qionghai Dai
  • Tsinghua University
  • Beijing University of Technology

Research output: Contribution to journal › Article › peer-review

40 Scopus citations

Abstract

Recent years have witnessed remarkable achievements in video-based action recognition. Unlike traditional frame-based cameras, event cameras are bio-inspired vision sensors that record only pixel-wise brightness changes rather than absolute brightness values. However, little effort has been devoted to event-based action recognition, and large-scale public datasets are scarce. In this paper, we propose an event-based action recognition framework called EV-ACT. A Learnable Multi-Fused Representation (LMFR) is first proposed to integrate multiple types of event information in a learnable manner. The LMFR, computed at two temporal granularities, is fed into an event-based slow-fast network that fuses appearance and motion features. A spatial-temporal attention mechanism is introduced to further enhance the learning capability for action recognition. To promote research in this direction, we have collected the largest event-based action recognition benchmark, named THUE-ACT-50, and the accompanying THUE-ACT-50-CHL dataset recorded under challenging environments. Together they include over 12,830 recordings from 50 action categories, which is over 4 times the size of the previous largest dataset. Experimental results show that our proposed framework achieves improvements of over 14.5%, 7.6%, 11.2%, and 7.4% compared to previous works on four benchmarks. We have also deployed the EV-ACT framework on a mobile platform to validate its practicality and efficiency.
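
The abstract does not describe how LMFR is computed, so the following is not the authors' method. As a rough illustration of what an event stream looks like and of what "dual temporal granularity" could mean for a simple hand-crafted (non-learnable) representation, this minimal sketch accumulates signed events into per-pixel count frames at a coarse and a fine time resolution. The event layout (x, y, timestamp in microseconds, polarity), the function name events_to_count_frames, and the toy data are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed event layout: (x, y, timestamp in microseconds, polarity in {+1, -1}).
Event = Tuple[int, int, int, int]


def events_to_count_frames(
    events: List[Event], width: int, height: int, num_bins: int
) -> List[List[List[int]]]:
    """Split the stream into num_bins equal time slices and sum polarities per pixel.

    Returns frames indexed as frames[bin][y][x].
    """
    frames = [[[0] * width for _ in range(height)] for _ in range(num_bins)]
    if not events:
        return frames
    t_start = min(e[2] for e in events)
    t_end = max(e[2] for e in events)
    duration = max(t_end - t_start, 1)  # avoid division by zero for a single timestamp
    for x, y, t, polarity in events:
        # Integer arithmetic maps the timestamp to a bin; the last event is clamped
        # into the final bin.
        b = min((t - t_start) * num_bins // duration, num_bins - 1)
        frames[b][y][x] += polarity
    return frames


# Hypothetical toy stream on a 4x3 sensor.
toy_events = [
    (0, 0, 0, 1),
    (1, 0, 10_000, 1),
    (1, 0, 20_000, -1),
    (3, 2, 30_000, 1),
    (3, 2, 40_000, 1),
]

# Coarse granularity: one frame over the whole clip (appearance-like summary).
coarse = events_to_count_frames(toy_events, width=4, height=3, num_bins=1)
# Fine granularity: four frames over the same clip (motion-like detail).
fine = events_to_count_frames(toy_events, width=4, height=3, num_bins=4)
```

In this toy example the coarse frame cancels the opposite-polarity events at pixel (1, 0), while the fine frames keep them in separate time slices. That difference is the kind of information a two-pathway network could exploit.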

Original language: English
Pages (from-to): 14081-14097
Number of pages: 17
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume: 45
Issue number: 12
DOIs
State: Published - 1 Dec 2023

Keywords

  • Action recognition
  • dynamic vision sensor
  • event camera
  • event representation
