TY - GEN
T1 - Action Coherence Network for Weakly Supervised Temporal Action Localization
AU - Zhai, Yuanhao
AU - Wang, Le
AU - Liu, Ziyi
AU - Zhang, Qilin
AU - Hua, Gang
AU - Zheng, Nanning
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/9
Y1 - 2019/9
N2 - Most prominent temporal action localization methods are fully supervised and rely heavily on frame-level labels, which can be prohibitively expensive to annotate. Weakly-supervised Temporal Action Localization (W-TAL) is an alternative paradigm that requires only video-level labels in training, alleviating this annotation effort. We present the Action Coherence Network (ACN) for W-TAL, which features a new coherence loss that better supervises action boundary learning and facilitates proposal regression. In addition, a purpose-built fusion module is proposed for localization inference based on features extracted by a two-stream convolutional neural network. Overall, the proposed ACN achieves state-of-the-art W-TAL performance on two challenging datasets, THUMOS14 and ActivityNet1.2; in particular, ACN attains an mAP of 24.2% on THUMOS14 under an IoU threshold of 0.5, approaching some recent fully-supervised TAL methods.
AB - Most prominent temporal action localization methods are fully supervised and rely heavily on frame-level labels, which can be prohibitively expensive to annotate. Weakly-supervised Temporal Action Localization (W-TAL) is an alternative paradigm that requires only video-level labels in training, alleviating this annotation effort. We present the Action Coherence Network (ACN) for W-TAL, which features a new coherence loss that better supervises action boundary learning and facilitates proposal regression. In addition, a purpose-built fusion module is proposed for localization inference based on features extracted by a two-stream convolutional neural network. Overall, the proposed ACN achieves state-of-the-art W-TAL performance on two challenging datasets, THUMOS14 and ActivityNet1.2; in particular, ACN attains an mAP of 24.2% on THUMOS14 under an IoU threshold of 0.5, approaching some recent fully-supervised TAL methods.
KW - coherence loss
KW - temporal action localization
KW - weakly-supervised
UR - https://www.scopus.com/pages/publications/85076809593
U2 - 10.1109/ICIP.2019.8803447
DO - 10.1109/ICIP.2019.8803447
M3 - Conference contribution
AN - SCOPUS:85076809593
T3 - Proceedings - International Conference on Image Processing, ICIP
SP - 3696
EP - 3700
BT - 2019 IEEE International Conference on Image Processing, ICIP 2019 - Proceedings
PB - IEEE Computer Society
T2 - 26th IEEE International Conference on Image Processing, ICIP 2019
Y2 - 22 September 2019 through 25 September 2019
ER -