Paper title
Detecting 32 Pedestrian Attributes for Autonomous Vehicles
Paper authors
Paper abstract
Pedestrians are arguably one of the most safety-critical road users to consider for autonomous vehicles in urban areas. In this paper, we address the problem of jointly detecting pedestrians and recognizing 32 pedestrian attributes from a single image. These encompass visual appearance and behavior, and also include the forecasting of road crossing, which is a main safety concern. For this, we introduce a Multi-Task Learning (MTL) model relying on a composite field framework, which achieves both goals in an efficient way. Each field spatially locates pedestrian instances and aggregates attribute predictions over them. This formulation naturally leverages spatial context, making it well suited to low resolution scenarios such as autonomous driving. By increasing the number of attributes jointly learned, we highlight an issue related to the scales of gradients, which arises in MTL with numerous tasks. We solve it by normalizing the gradients coming from different objective functions when they join at the fork in the network architecture during the backward pass, referred to as fork-normalization. Experimental validation is performed on JAAD, a dataset providing numerous attributes for pedestrian analysis from autonomous vehicles, and shows competitive detection and attribute recognition results, as well as a more stable MTL training.
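The gradient-scale issue and the idea of normalizing gradients where the task heads join can be illustrated with a small sketch. The abstract does not state which norm or combination rule is used, so the choices below are assumptions for illustration only: each head's gradient with respect to the shared features is rescaled to unit L2 norm, and the rescaled gradients are then averaged. The function name `fork_normalize` and the `eps` parameter are hypothetical and do not come from the paper.

```python
import math
from typing import List, Sequence


def l2_norm(vec: Sequence[float]) -> float:
    """Euclidean norm of a flat gradient vector."""
    return math.sqrt(sum(x * x for x in vec))


def fork_normalize(head_grads: Sequence[Sequence[float]],
                   eps: float = 1e-12) -> List[float]:
    """Combine per-head gradients at the fork of a multi-task network.

    Assumption (not taken from the paper): each head's gradient is
    rescaled to unit L2 norm, then the rescaled gradients are averaged,
    so that no single objective dominates the shared backbone update.
    """
    if not head_grads:
        raise ValueError("need at least one head gradient")
    dim = len(head_grads[0])
    combined = [0.0] * dim
    for grad in head_grads:
        if len(grad) != dim:
            raise ValueError("all head gradients must have the same length")
        # eps guards against division by zero for an all-zero gradient.
        scale = 1.0 / (l2_norm(grad) + eps)
        for i, g in enumerate(grad):
            combined[i] += g * scale
    return [c / len(head_grads) for c in combined]


# Two hypothetical heads whose gradients differ by orders of magnitude.
detection_grad = [300.0, 400.0]   # L2 norm 500
attribute_grad = [0.0, 0.002]     # L2 norm 0.002

# Plain summation is dominated almost entirely by the detection head.
naive = [d + a for d, a in zip(detection_grad, attribute_grad)]

# After normalization both heads contribute with comparable weight.
balanced = fork_normalize([detection_grad, attribute_grad])
```

In this toy case the plain sum is essentially the detection gradient alone, whereas the normalized combination gives the attribute head an equal say in the update direction. With 32 attributes plus detection, the same mismatch can occur across many heads at once, which is the situation the abstract describes.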