Abstract
Learning discriminative features plays a significant role in action recognition. Many attempts have been made to train deep neural networks on labeled data. However, in previous networks, variations in view or distance can make intra-class differences even larger than inter-class differences. In this work, we propose a new contrastive self-supervised learning method for action recognition from unlabeled skeletal videos. Through contrastive representation learning over suitable compositions of viewpoints and distances, the self-supervised network selects discriminative features with invariant motion semantics for action recognition. We hope this attempt will be helpful for the study of unsupervised learning in skeleton-based action recognition.
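The general idea of contrasting two views of the same skeleton clip can be illustrated with a minimal sketch. The abstract does not specify the augmentations, encoder, or loss, so everything here is an assumption made for illustration: viewpoint change is modeled as a rotation about the vertical axis, distance change as uniform scaling, the loss is the commonly used InfoNCE objective, and `toy_encoder` is a hypothetical linear stand-in for a real network.

```python
import numpy as np

def rotate_view(skeleton, angle_rad):
    """Rotate (..., joints, 3) coordinates about the vertical (y) axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    rot = np.array([[c, 0.0, s],
                    [0.0, 1.0, 0.0],
                    [-s, 0.0, c]])
    return skeleton @ rot.T

def change_distance(skeleton, scale):
    """Uniform scaling, a crude stand-in for a change in camera distance."""
    return skeleton * scale

def augment(skeleton, rng):
    """Compose a random viewpoint change with a random distance change."""
    angle = rng.uniform(-np.pi / 3, np.pi / 3)   # assumed range
    scale = rng.uniform(0.7, 1.3)                # assumed range
    return change_distance(rotate_view(skeleton, angle), scale)

def toy_encoder(batch, weights):
    """Hypothetical encoder: flatten each clip and project linearly."""
    return batch.reshape(batch.shape[0], -1) @ weights

def info_nce(z_a, z_b, temperature=0.1):
    """InfoNCE: z_a[i] and z_b[i] form a positive pair; other rows are negatives."""
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

rng = np.random.default_rng(0)
batch = rng.normal(size=(8, 16, 25, 3))  # 8 clips, 16 frames, 25 joints, xyz
view_a = np.stack([augment(clip, rng) for clip in batch])
view_b = np.stack([augment(clip, rng) for clip in batch])
weights = rng.normal(size=(16 * 25 * 3, 32))
loss = info_nce(toy_encoder(view_a, weights), toy_encoder(view_b, weights))
```

Minimizing such a loss pulls together the embeddings of two differently viewed and scaled versions of the same clip while pushing apart embeddings of different clips, which is one way to encourage features that are invariant to viewpoint and distance.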
| Original language | English |
|---|---|
| Pages (from-to) | 51-61 |
| Number of pages | 11 |
| Journal | Proceedings of Machine Learning Research |
| Volume | 148 |
| State | Published - 2021 |
| Event | NeurIPS 2020 Workshop on Pre-Registration in Machine Learning - Virtual, Online. Duration: 11 Dec 2020 → … |