Paper Title
Upper, Middle and Lower Region Learning for Facial Action Unit Detection
Paper Authors
Paper Abstract
Facial action unit (AU) detection is fundamental to facial expression analysis. Because each AU occurs only in a small area of the face, region-based learning has been widely recognized as useful for AU detection. Most region-based studies focus on the small region where an AU occurs. Focusing on a specific region helps eliminate the influence of identity, but it brings a risk of losing information, and finding the right balance is challenging. In this study, I propose a simple strategy: I divide the face into three broad regions, the upper, middle, and lower regions, and group AUs based on where they occur. I propose a new end-to-end deep learning framework named the Three Regions based Attention Network (TRA-Net). After extracting a global feature, TRA-Net uses a hard attention module to extract three feature maps, each of which contains only one specific region. Each region-specific feature map is fed to an independent branch. In each branch, three consecutive soft attention modules are used to extract higher-level features for final AU detection. On the DISFA dataset, this model achieves the highest F1 scores for the detection of AU1, AU2, and AU4, and produces the highest accuracy compared with state-of-the-art methods.
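The sketch below illustrates the pipeline described in the abstract: a global feature extractor, a hard attention step that yields three region-specific feature maps (upper, middle, lower), and an independent branch of three consecutive soft attention modules per region followed by an AU classifier. It is a minimal illustration under stated assumptions, not the paper's implementation: the backbone, the fixed horizontal split used as hard attention, the internals of the soft attention block, and the AU-per-region counts are all hypothetical choices for illustration.

```python
# Minimal PyTorch sketch of the TRA-Net pipeline described in the abstract.
# Backbone, attention internals, and AU grouping sizes are assumptions.
import torch
import torch.nn as nn


class SoftAttention(nn.Module):
    """Soft attention block (hypothetical internals): re-weight features, then refine."""
    def __init__(self, channels):
        super().__init__()
        self.attn = nn.Sequential(nn.Conv2d(channels, channels, kernel_size=1), nn.Sigmoid())
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x * self.attn(x))


class TRANet(nn.Module):
    def __init__(self, channels=64, aus_per_region=(2, 3, 3)):
        super().__init__()
        # Global feature extractor (stand-in backbone).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, channels, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, stride=2, padding=1), nn.ReLU(),
        )
        # One independent branch per region: three consecutive soft attention modules.
        self.branches = nn.ModuleList([
            nn.Sequential(SoftAttention(channels), SoftAttention(channels), SoftAttention(channels))
            for _ in aus_per_region
        ])
        # One AU classification head per region.
        self.heads = nn.ModuleList([nn.Linear(channels, n_aus) for n_aus in aus_per_region])

    @staticmethod
    def hard_attention(feat):
        """Split the global feature map into upper / middle / lower horizontal bands.
        (A simple fixed split; the paper's hard attention module may be learned.)"""
        h = feat.shape[2]
        return feat[:, :, : h // 3], feat[:, :, h // 3 : 2 * h // 3], feat[:, :, 2 * h // 3 :]

    def forward(self, x):
        global_feat = self.backbone(x)
        region_feats = self.hard_attention(global_feat)
        logits = []
        for feat, branch, head in zip(region_feats, self.branches, self.heads):
            f = branch(feat)          # three soft attention modules
            f = f.mean(dim=(2, 3))    # global average pooling over the region
            logits.append(head(f))    # per-region AU predictions
        return torch.cat(logits, dim=1)  # concatenated AU logits for all regions


model = TRANet()
out = model(torch.randn(1, 3, 96, 96))  # e.g. shape [1, 8]: one logit per grouped AU
```

Grouping the AU heads by region keeps each branch focused on the features of its own face band while the shared backbone still provides global context, which is the balance the abstract argues for.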