Optimizing Traffic Lights with Multi-agent Deep Reinforcement Learning and V2X Communication

Paper Title

Optimizing Traffic Lights with Multi-agent Deep Reinforcement Learning and V2X communication

Authors

Hussain, Azhar; Wang, Tong; Jiahua, Cao

Abstract

We consider a system that optimizes the duration of traffic signals using multi-agent deep reinforcement learning and Vehicle-to-Everything (V2X) communication. The system analyzes independent and shared rewards for multiple agents that control the duration of traffic lights. Each learning agent, a traffic light, gathers information along its lanes within a circular V2X coverage area. The duration cycles of a traffic light are modeled as a Markov decision process. We investigate four variations of the reward function. The first two are unshared rewards, based on the number of waiting vehicles and on the waiting time of vehicles between two traffic light cycles. The third and fourth are shared rewards, based on the waiting cars and on the waiting time across all agents. Each agent has a memory used for optimization through a target network and prioritized experience replay. We evaluate the agents in the Simulation of Urban MObility (SUMO) simulator. The results demonstrate the effectiveness of the proposed system in optimizing traffic signals, reducing the average number of waiting cars to 41.5% compared with the traditional periodic traffic control system.
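
The four reward variants named in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the class LaneObservation, the function names, the agent identifiers, and the sign convention (reward is the negative of the congestion measure) are all assumptions made here, since the abstract does not give the exact formulas.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class LaneObservation:
    """Hypothetical snapshot one traffic-light agent collects within its V2X range."""
    waiting_times: List[float]  # seconds waited by each halted vehicle


def waiting_count(obs: LaneObservation) -> float:
    """Number of vehicles currently waiting at this agent's intersection."""
    return float(len(obs.waiting_times))


def waiting_time(obs: LaneObservation) -> float:
    """Total waiting time of the vehicles at this agent's intersection."""
    return float(sum(obs.waiting_times))


Metric = Callable[[LaneObservation], float]


def unshared_reward(obs: LaneObservation, metric: Metric) -> float:
    """Assumed form: each agent is penalized only for its own congestion."""
    return -metric(obs)


def shared_reward(all_obs: Dict[str, LaneObservation], metric: Metric) -> float:
    """Assumed form: every agent receives the same penalty, summed over all agents."""
    return -sum(metric(o) for o in all_obs.values())


# Two hypothetical agents observed at the end of one signal cycle.
observations = {
    "tl_a": LaneObservation(waiting_times=[5.0, 12.0]),
    "tl_b": LaneObservation(waiting_times=[3.0]),
}

r1 = unshared_reward(observations["tl_a"], waiting_count)  # variant 1
r2 = unshared_reward(observations["tl_a"], waiting_time)   # variant 2
r3 = shared_reward(observations, waiting_count)            # variant 3
r4 = shared_reward(observations, waiting_time)             # variant 4
```

Passing the congestion measure as a function keeps the shared versus unshared choice separate from the count versus time choice, so the four combinations come from two small functions.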
