TY - GEN
T1 - Human action detection by boosting efficient motion features
AU - Yang, Ming
AU - Lv, Fengjun
AU - Xu, Wei
AU - Yu, Kai
AU - Gong, Yihong
PY - 2009
Y1 - 2009
N2 - Recent years have witnessed significant progress in detection of basic human actions. However, most existing methods rely on assumptions such as known spatial locations and temporal segmentations or employ very computationally expensive approaches such as sliding window search through a spatio-temporal volume. It is difficult for such methods to scale up to handle the challenges in real applications such as video surveillance. In this paper, we present an efficient and practical approach to detecting basic human actions, such as making cell phone calls, putting down objects, and hand-pointing, which has been extensively tested on the challenging 2008 TRECVID surveillance event detection dataset. We propose a novel action representation scheme using a set of motion edge history images, which not only encodes both shape and motion patterns of actions without relying on precise alignment of human figures, but also facilitates learning of fast tree-structured boosting classifiers. Our approach is robust with respect to cluttered background as well as scale and viewpoint changes. It is also computationally efficient by taking advantage of human detection and tracking to reduce the searching space. We demonstrate promising results on the 50-hour TRECVID development set as well as two other widely-used benchmark datasets of action recognition, i.e. the KTH dataset and the Weizmann dataset.
AB - Recent years have witnessed significant progress in detection of basic human actions. However, most existing methods rely on assumptions such as known spatial locations and temporal segmentations or employ very computationally expensive approaches such as sliding window search through a spatio-temporal volume. It is difficult for such methods to scale up to handle the challenges in real applications such as video surveillance. In this paper, we present an efficient and practical approach to detecting basic human actions, such as making cell phone calls, putting down objects, and hand-pointing, which has been extensively tested on the challenging 2008 TRECVID surveillance event detection dataset. We propose a novel action representation scheme using a set of motion edge history images, which not only encodes both shape and motion patterns of actions without relying on precise alignment of human figures, but also facilitates learning of fast tree-structured boosting classifiers. Our approach is robust with respect to cluttered background as well as scale and viewpoint changes. It is also computationally efficient by taking advantage of human detection and tracking to reduce the searching space. We demonstrate promising results on the 50-hour TRECVID development set as well as two other widely-used benchmark datasets of action recognition, i.e. the KTH dataset and the Weizmann dataset.
UR - https://www.scopus.com/pages/publications/77953193985
U2 - 10.1109/ICCVW.2009.5457656
DO - 10.1109/ICCVW.2009.5457656
M3 - Conference contribution
AN - SCOPUS:77953193985
SN - 9781424444427
T3 - 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops 2009
SP - 522
EP - 529
BT - 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops 2009
PB - IEEE Computer Society
T2 - 12th IEEE International Conference on Computer Vision Workshops, ICCVW 2009
Y2 - 27 September 2009 through 4 October 2009
ER -