Deep conditional variational estimation for depth-based hand poses

  • Lu Xu
  • Chen Hu
  • Yinqi Li
  • Ji'an Tao
  • Jianru Xue
  • Kuizhi Mei

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

2 Scopus citations

Abstract

We propose a novel and effective approach for 3D hand pose estimation from a single depth image. Instead of performing deterministic regression from depth images, our model learns a latent distribution over the high-dimensional space of pose joints, which can also be interpreted as a kinematic model of the human hand. Specifically, the proposed network combines a conditional variational autoencoder, which learns an encoder and a decoder, with a standard convolutional network. The encoder models the latent variable as a prior, or regularizer, for the pose joints. The decoder then performs probabilistic inference to generate the output prediction conditioned on the input depth image. In addition, we introduce a pool-convolution module to improve the localization regression of the network. The architecture can be trained end-to-end. In experiments, we demonstrate the effectiveness of the proposed approach in comparison with various state-of-the-art holistic regression approaches.
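
The abstract does not state the training objective, so the following is only a minimal sketch of the generic conditional variational autoencoder loss, assuming a diagonal Gaussian latent, a standard normal prior, and a squared-error reconstruction term. The function names, the `beta` weight, and the latent size are hypothetical and are not taken from the paper.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), one value per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)) for a diagonal Gaussian, summed over dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))


def cvae_loss(pose_true, pose_pred, mu, log_var, beta=1.0):
    """Squared reconstruction error plus a beta-weighted KL term (assumed objective)."""
    recon = sum((t - p) ** 2 for t, p in zip(pose_true, pose_pred))
    return recon + beta * kl_to_standard_normal(mu, log_var)


rng = random.Random(0)
latent_dim = 8  # hypothetical latent size
mu = [0.0] * latent_dim
log_var = [0.0] * latent_dim
z = reparameterize(mu, log_var, rng)  # latent sample that a decoder would consume
```

In a full model, `mu` and `log_var` would come from an encoder and `pose_pred` from a decoder conditioned on depth-image features; here they are plain lists so that the arithmetic can be checked in isolation.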

Original language: English
Title of host publication: Proceedings - 14th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728100890
State: Published - May 2019
Event: 14th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2019 - Lille, France
Duration: 14 May 2019 to 18 May 2019

Publication series

Name: Proceedings - 14th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2019

Conference

Conference: 14th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2019
Country/Territory: France
City: Lille
Period: 14/05/19 to 18/05/19
