Detection of (Hidden) Emotions from Videos using Muscles Movements and Face Manifold Embedding

Paper Title

Detection of (Hidden) Emotions from Videos using Muscles Movements and Face Manifold Embedding

Authors

Juni Kim, Zhikang Dong, Eric Guan, Judah Rosenthal, Shi Fu, Miriam Rafailovich, Pawel Polak

Abstract

We provide a new method for (hidden) emotion detection from videos of human faces that is non-invasive, easy to scale to a large number of subjects, and remotely accessible. Our approach combines face manifold detection, which accurately locates the face in the video, with local face manifold embedding, which creates a common domain for measuring muscle micro-movements that is invariant to the movement of the subject in the video. In the next step, we employ Digital Image Speckle Correlation (DISC) and the optical flow algorithm to compute the pattern of micro-movements in the face. The corresponding vector field is mapped back to the original space and superimposed on the original frames of the videos. Hence, the resulting videos include additional information about the direction of movement of the facial muscles. We take the publicly available CK++ dataset of visible emotions and add to it videos of the same format but with hidden emotions. We process all the videos using our micro-movement detection and use the results to train a state-of-the-art network for emotion classification from videos, the Frame Attention Network (FAN). Although the original FAN model achieves very high out-of-sample performance on the original CK++ videos, it does not perform as well on the hidden-emotion videos. The performance improves significantly when the model is trained and tested on videos with the vector fields of muscle movements. Intuitively, the corresponding arrows serve as edges in the image that are easily captured by the convolutional filters in the FAN network.
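The micro-movement step in the abstract can be illustrated with a toy example. The sketch below is not the authors' DISC or optical flow implementation. It is a minimal block-matching stand-in that assumes a displacement can be estimated by searching for the shift that minimizes the sum of squared differences (SSD) between a patch in one frame and candidate patches in the next. All names (`make_frame`, `shift_frame`, `block_displacement`), the synthetic texture, and the patch and search sizes are hypothetical choices made for illustration.

```python
from typing import List, Tuple

Frame = List[List[int]]


def make_frame(height: int, width: int) -> Frame:
    """Build a deterministic, non-repeating synthetic texture (illustrative only)."""
    return [[(37 * i + 91 * j + 13 * i * j) % 101 for j in range(width)]
            for i in range(height)]


def shift_frame(frame: Frame, dy: int, dx: int) -> Frame:
    """Move the frame content by (dy, dx) pixels; uncovered pixels become 0."""
    h, w = len(frame), len(frame[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            si, sj = i - dy, j - dx
            if 0 <= si < h and 0 <= sj < w:
                out[i][j] = frame[si][sj]
    return out


def patch(frame: Frame, top: int, left: int, size: int) -> Frame:
    """Extract a size x size patch whose upper-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def ssd(a: Frame, b: Frame) -> int:
    """Sum of squared differences between two equally sized patches."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def block_displacement(prev: Frame, curr: Frame, top: int, left: int,
                       size: int = 4, search: int = 2) -> Tuple[int, int]:
    """Return the (dy, dx) shift within +/- search that best matches the patch."""
    ref = patch(prev, top, left, size)
    h, w = len(curr), len(curr[0])
    best, best_cost = (0, 0), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > h or l + size > w:
                continue  # candidate patch would fall outside the frame
            cost = ssd(ref, patch(curr, t, l, size))
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best


# Toy usage: the second frame is the first one moved down 1 and right 2 pixels.
prev_frame = make_frame(12, 12)
curr_frame = shift_frame(prev_frame, dy=1, dx=2)
print(block_displacement(prev_frame, curr_frame, top=4, left=4))  # (1, 2)
```

Repeating this search over a grid of patches would yield a coarse vector field of the kind the abstract describes being superimposed on the frames. A real pipeline would work on the face-aligned domain and use sub-pixel correlation or dense optical flow rather than this integer search.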
