Abstract

Head poses are a key component of human bodily communication and thus a decisive element of human-computer interaction. Real-time head pose estimation is crucial in the context of human-robot interaction and driver assistance systems. The most promising approaches to head pose estimation are based on Convolutional Neural Networks (CNNs). However, CNN models are often too complex to achieve real-time performance. To address this challenge, we explore a popular subgroup of CNNs, the Residual Networks (ResNets), and modify them to reduce their number of parameters. The ResNets are modified for different image sizes, including low-resolution images, and combined with a varying number of layers. They are trained on in-the-wild datasets to ensure real-world applicability. As a result, we demonstrate that the performance of the ResNets can be maintained while reducing the number of parameters. The modified ResNets achieve state-of-the-art accuracy and provide fast inference for real-time applicability.
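To give a rough sense of how strongly layer widths drive the parameter count of a residual network, the following sketch counts the parameters of a stack of basic residual blocks. It is illustrative only: the helper names, the stage widths, and the number of blocks per stage are assumptions chosen for the example, not the architectures evaluated in the paper, and the input stem and output layers are left out.

```python
# Illustrative parameter count for a stack of basic residual blocks.
# All widths and block counts below are hypothetical, not the paper's models.

def conv_params(k, c_in, c_out):
    """Weights of a k x k convolution from c_in to c_out channels (no bias)."""
    return k * k * c_in * c_out

def basic_block_params(c_in, c_out):
    """Two 3x3 convolutions, each followed by batch norm (scale and shift).

    A 1x1 projection with batch norm is counted on the skip path
    whenever the channel count changes.
    """
    n = conv_params(3, c_in, c_out) + 2 * c_out
    n += conv_params(3, c_out, c_out) + 2 * c_out
    if c_in != c_out:
        n += conv_params(1, c_in, c_out) + 2 * c_out
    return n

def stack_params(widths, blocks_per_stage):
    """Total parameters of consecutive stages with the given channel widths."""
    total = 0
    c_in = widths[0]  # assume the stem already outputs widths[0] channels
    for w in widths:
        for _ in range(blocks_per_stage):
            total += basic_block_params(c_in, w)
            c_in = w
    return total

full = stack_params([64, 128, 256, 512], blocks_per_stage=2)
slim = stack_params([32, 64, 128, 256], blocks_per_stage=2)
print(f"full: {full:,} parameters")
print(f"slim: {slim:,} parameters")
print(f"reduction factor: {full / slim:.2f}")
```

Because convolution weights scale with the product of input and output channels, halving every width in this sketch cuts the count by close to a factor of four. Removing layers, by contrast, reduces it only in proportion to the blocks removed.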


Original document

The different versions of the original document can be found at:

http://dx.doi.org/10.1007/978-3-030-22999-3_12 under the license http://www.springer.com/tdm
https://arxiv.org/pdf/1906.05203
https://arxiv.org/abs/1906.05203
https://ui.adsabs.harvard.edu/abs/2019arXiv190605203R/abstract
http://export.arxiv.org/pdf/1906.05203
http://export.arxiv.org/abs/1906.05203
https://academic.microsoft.com/#/detail/2953397651

Document information

Published on 01/01/2019

Volume 2019, 2019
DOI: 10.1007/978-3-030-22999-3_12
Licence: Other
