Temporal aggregation with clip-level attention for video-based person re-identification

  • Mengliu Li
  • , Han Xu
  • , Jinjun Wang
  • , Wenpeng Li
  • , Yongli Sun

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

14 Scopus citations

Abstract

Video-based person re-identification (Re-ID) methods can extract richer features than image-based ones from short video clips. The existing methods usually apply simple strategies, such as average/max pooling, to obtain the tracklet-level features, which has been proved hard to aggregate the information from all video frames. In this paper, we propose a simple yet effective Temporal Aggregation with Clip-level Attention Network (TACAN) to solve the temporal aggregation problem in a hierarchal way. Specifically, a tracklet is firstly broken into different numbers of clips, through a two-stage temporal aggregation network we can get the tracklet-level feature representation. A novel min-max loss is introduced to learn both a clip-level attention extractor and a clip-level feature representer in the training process. Afterwards, the resulting clip-level weights are further taken to average the clip-level features, which can generate a robust tracklet-level feature representation at the testing stage. Experimental results on four benchmark datasets, including the MARS, iLIDS-VID, PRID-2011 and DukeMTMC-VideoReID, show that our TACAN has achieved significant improvements as compared with the state-of-the-art approaches.

Original languageEnglish
Title of host publicationProceedings - 2020 IEEE Winter Conference on Applications of Computer Vision, WACV 2020
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages3365-3373
Number of pages9
ISBN (Electronic)9781728165530
DOIs
StatePublished - Mar 2020
Event2020 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2020 - Snowmass Village, United States
Duration: 1 Mar 20205 Mar 2020

Publication series

NameProceedings - 2020 IEEE Winter Conference on Applications of Computer Vision, WACV 2020

Conference

Conference2020 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2020
Country/TerritoryUnited States
CitySnowmass Village
Period1/03/205/03/20

Fingerprint

Dive into the research topics of 'Temporal aggregation with clip-level attention for video-based person re-identification'. Together they form a unique fingerprint.

Cite this