Title
Improved Digital Therapy for Developmental Pediatrics Using Domain-Specific Artificial Intelligence: Machine Learning Study
Authors
Abstract
Background: Automated emotion classification could aid those who struggle to recognize emotions, including children with developmental behavioral conditions such as autism. However, most computer vision emotion recognition models are trained on adult emotions and therefore underperform when applied to child faces. Objective: We designed a strategy to gamify the collection and labeling of child emotion-enriched images to boost the performance of automatic child emotion recognition models to a level closer to what will be needed for digital health care approaches. Methods: We leveraged our prototype therapeutic smartphone game, GuessWhat, which was designed in large part for children with developmental and behavioral conditions, to gamify the secure collection of video data of children expressing a variety of emotions prompted by the game. Independently, we created a secure web interface, called HollywoodSquares, to gamify the human labeling effort, tailored for use by any qualified labeler. We gathered and labeled 2155 videos, 39,968 emotion frames, and 106,001 labels on all images. With this drastically expanded pediatric emotion-centric database (>30 times larger than existing public pediatric emotion data sets), we trained a convolutional neural network (CNN) computer vision classifier of happy, sad, surprised, fearful, angry, disgusted, and neutral expressions evoked by children. Results: The classifier achieved a 66.9% balanced accuracy and 67.4% F1-score on the entirety of the Child Affective Facial Expression (CAFE) data set, as well as a 79.1% balanced accuracy and 78% F1-score on CAFE Subset A, a subset containing at least 60% human agreement on emotion labels. This performance is at least 10% higher than that of all previously developed classifiers evaluated against CAFE, the best of which reached a 56% balanced accuracy even when combining "anger" and "disgust" into a single class.
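The abstract reports results as balanced accuracy and F1-score for a seven-class child emotion classifier evaluated on CAFE. The sketch below is an illustration only, not the authors' code: it shows how these two metrics are typically computed with scikit-learn, assuming arrays of ground-truth and predicted emotion labels; the label values and predictions are hypothetical.

```python
# Illustrative sketch (not the authors' code): computing balanced accuracy and
# macro F1-score for a seven-class child emotion classifier with scikit-learn.
from sklearn.metrics import balanced_accuracy_score, f1_score

EMOTIONS = ["happy", "sad", "surprised", "fearful", "angry", "disgusted", "neutral"]

# Hypothetical ground-truth labels and classifier predictions for a handful of
# CAFE-style frames (in practice these would come from the trained CNN).
y_true = ["happy", "sad", "neutral", "angry", "disgusted", "fearful", "surprised", "happy"]
y_pred = ["happy", "sad", "neutral", "angry", "disgusted", "fearful", "surprised", "sad"]

# Balanced accuracy averages per-class recall, so rare emotions count as much
# as common ones; macro F1 likewise averages per-class F1 scores.
bal_acc = balanced_accuracy_score(y_true, y_pred)
macro_f1 = f1_score(y_true, y_pred, labels=EMOTIONS, average="macro")

print(f"balanced accuracy: {bal_acc:.3f}")
print(f"macro F1-score:    {macro_f1:.3f}")
```

Balanced accuracy is used here because emotion classes in child-face data sets are typically imbalanced; a plain accuracy score would overstate performance on the dominant classes.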