Paper Title
When Do Drivers Concentrate? Attention-based Driver Behavior Modeling With Deep Reinforcement Learning
Paper Authors
Paper Abstract
Driver distraction is a significant risk to driving safety. Beyond the spatial domain, research on temporal inattention is also necessary. This paper aims to identify the pattern of drivers' temporal attention allocation. We propose an actor-critic method, the Attention-based Twin Delayed Deep Deterministic policy gradient (ATD3) algorithm, to approximate a driver's action from observations and to measure the driver's attention allocation over consecutive time steps in a car-following model. To account for reaction time, we build an attention mechanism into the actor network to capture temporal dependencies among consecutive observations. In the critic network, we employ the Twin Delayed Deep Deterministic policy gradient (TD3) algorithm to address the overestimated value estimates that persist in actor-critic algorithms. We conduct experiments on real-world vehicle trajectory datasets and show that the proposed approach is more accurate than seven baseline algorithms. Moreover, the results reveal that the attention of drivers in smoothly moving vehicles is uniformly distributed over previous observations, whereas drivers focus on recent observations when the relative speed suddenly decreases. This study is the first contribution on drivers' temporal attention and provides scientific support for safety measures in transportation systems from a data mining perspective.
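The abstract names two ingredients without giving their details: attention over consecutive observations in the actor, and TD3 in the critic. The sketch below is only an illustration of those two ideas, not the authors' implementation. It assumes scaled dot-product scoring against a single vector for the attention part, and it shows the standard TD3 clipped double-Q target for the critic part. The function names, the `query` vector, and the two-feature observation layout (gap, relative speed) are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(observations, query):
    """Weight a window of past observations (oldest first).

    `query` stands in for a learned vector (an assumption; the abstract
    does not specify the scoring function). Returns the attention weights
    and the weighted sum of the observations.
    """
    scale = math.sqrt(len(query))
    scores = [
        sum(o * q for o, q in zip(obs, query)) / scale
        for obs in observations
    ]
    weights = softmax(scores)
    dim = len(observations[0])
    context = [
        sum(w * obs[d] for w, obs in zip(weights, observations))
        for d in range(dim)
    ]
    return weights, context


def td3_target(reward, gamma, q1_next, q2_next, done=False):
    """TD3 critic target: take the smaller of two critic estimates
    for the next state to curb value overestimation."""
    if done:
        return reward
    return reward + gamma * min(q1_next, q2_next)


# Hypothetical window of three time steps: [gap (m), relative speed (m/s)].
window = [[10.0, 0.5], [9.5, 0.2], [9.0, -1.0]]
weights, context = temporal_attention(window, query=[0.0, -2.0])
print(weights)  # one weight per time step, summing to 1
print(td3_target(reward=1.0, gamma=0.9, q1_next=2.0, q2_next=3.0))
```

In this toy setup the weights can be read directly as a per-time-step attention allocation, which is the kind of quantity the abstract reports when it contrasts uniform attention with attention concentrated on recent observations.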