Paper Title

Driver Glance Classification In-the-wild: Towards Generalization Across Domains and Subjects

Paper Authors

Sandipan Banerjee, Ajjen Joshi, Jay Turcot, Bryan Reimer, Taniya Mishra

Paper Abstract

Distracted drivers are dangerous drivers. Equipping advanced driver assistance systems (ADAS) with the ability to detect driver distraction can help prevent accidents and improve driver safety. In order to detect driver distraction, an ADAS must be able to monitor their visual attention. We propose a model that takes as input a patch of the driver's face along with a crop of the eye-region and classifies their glance into 6 coarse regions-of-interest (ROIs) in the vehicle. We demonstrate that an hourglass network, trained with an additional reconstruction loss, allows the model to learn stronger contextual feature representations than a traditional encoder-only classification module. To make the system robust to subject-specific variations in appearance and behavior, we design a personalized hourglass model tuned with an auxiliary input representing the driver's baseline glance behavior. Finally, we present a weakly supervised multi-domain training regimen that enables the hourglass to jointly learn representations from different domains (varying in camera type, angle), utilizing unlabeled samples and thereby reducing annotation cost.
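
To make the core idea in the abstract concrete, below is a minimal PyTorch sketch (not the authors' implementation) of an encoder-decoder "hourglass" network whose bottleneck features drive a 6-way glance classifier, trained jointly with a cross-entropy classification loss and an auxiliary reconstruction loss. The layer widths, the 64x64 input resolution, and the 0.5 reconstruction weight are illustrative assumptions, not values from the paper.

# Sketch of an hourglass glance classifier with an auxiliary reconstruction
# loss, as described at a high level in the abstract. All hyperparameters
# (channel widths, input size, loss weight) are assumptions for illustration.
import torch
import torch.nn as nn

class HourglassGlanceClassifier(nn.Module):
    def __init__(self, num_classes: int = 6):
        super().__init__()
        # Encoder: face / eye-region crop -> bottleneck feature map
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Decoder: reconstructs the input, pushing the bottleneck to keep
        # contextual information rather than only class-discriminative cues
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )
        # Classifier head on pooled bottleneck features -> 6 coarse glance ROIs
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, num_classes)
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.classifier(z), self.decoder(z)


def joint_loss(logits, recon, images, labels, recon_weight: float = 0.5):
    # Cross-entropy on the glance label plus a weighted reconstruction term;
    # the 0.5 weight is a placeholder, not a value reported in the paper.
    ce = nn.functional.cross_entropy(logits, labels)
    mse = nn.functional.mse_loss(recon, images)
    return ce + recon_weight * mse


# Toy usage: a batch of 64x64 crops with random glance labels.
model = HourglassGlanceClassifier()
images = torch.rand(4, 3, 64, 64)
labels = torch.randint(0, 6, (4,))
logits, recon = model(images)
loss = joint_loss(logits, recon, images, labels)
loss.backward()

The personalization and weakly supervised multi-domain training described in the abstract would sit on top of such a backbone (e.g., an extra auxiliary input for the driver's baseline glance behavior, and shared encoders across camera domains), but those components are not sketched here.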
