Multi-task head pose estimation

Paper title

Multi-task head pose estimation in-the-wild

Authors

Roberto Valle, José Miguel Buenaposada, Luis Baumela

Abstract

We present a deep learning-based multi-task approach for head pose estimation in images. We contribute a network architecture and training strategy that harness the strong dependencies among face pose, alignment and visibility to produce a top-performing model for all three tasks. Our architecture is an encoder-decoder CNN with residual blocks and lateral skip connections. We show that combining head pose estimation with landmark-based face alignment significantly improves the performance of the former task. Further, placing the pose task at the bottleneck layer, at the end of the encoder, and the tasks that depend on spatial information, such as visibility and alignment, in the final decoder layer also contributes to increasing the final performance. In the experiments conducted, the proposed model outperforms the state of the art in the face pose and visibility tasks. By including a final landmark regression step, it also produces face alignment results on par with the state of the art.
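
The task placement described in the abstract can be illustrated with a small layout sketch. It is not the authors' implementation: it only traces where each task head would attach in a symmetric encoder-decoder. The input size, the depth, the halving and doubling of resolution, and the names `Stage` and `build_layout` are all assumptions made for illustration, since the abstract does not state them.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Stage:
    name: str
    resolution: int                   # side length of the square feature map
    skip_from: Optional[str] = None   # encoder stage feeding a lateral skip connection
    heads: List[str] = field(default_factory=list)


def build_layout(input_size: int = 256, depth: int = 4) -> List[Stage]:
    """Trace feature-map resolution and head placement (hypothetical sizes)."""
    stages: List[Stage] = []
    res = input_size

    # Encoder: each stage is assumed to halve the spatial resolution.
    for i in range(1, depth + 1):
        res //= 2
        stages.append(Stage(f"enc{i}", res))

    # The last encoder stage is the bottleneck; the pose head attaches here.
    stages[-1].heads.append("pose")

    # Decoder: each stage doubles the resolution and receives a lateral
    # skip connection from the encoder stage with the same resolution.
    for i in range(depth - 1, 0, -1):
        res *= 2
        stages.append(Stage(f"dec{i}", res, skip_from=f"enc{i}"))

    # Tasks that depend on spatial information attach to the final decoder stage.
    stages[-1].heads += ["visibility", "alignment"]
    return stages


if __name__ == "__main__":
    for s in build_layout():
        print(s.name, s.resolution, s.skip_from, s.heads)
```

In this sketch the pose head sees only the most compressed features, while the visibility and alignment heads see the highest-resolution decoder output, which mirrors the placement the abstract credits with improving final performance.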

Paper title