Paper Title
Computer Vision for Transit Travel Time Prediction: An End-to-End Framework Using Roadside Urban Imagery
Paper Authors
Abstract
Accurate travel time estimation is paramount for providing transit users with reliable schedules and dependable real-time information. This paper is the first to utilize roadside urban imagery for direct transit travel time prediction. We propose and evaluate an end-to-end framework integrating traditional transit data sources with a roadside camera for automated roadside image data acquisition, labeling, and model training to predict transit travel times across a segment of interest. First, we show how GTFS real-time data can be utilized as an efficient activation mechanism for a roadside camera unit monitoring a segment of interest. Second, AVL data is utilized to generate ground truth labels for the acquired images based on the observed transit travel time percentiles across the camera-monitored segment at the time of image acquisition. Finally, the generated labeled image dataset is used to train and thoroughly evaluate a Vision Transformer (ViT) model to predict a discrete transit travel time range (band). The results illustrate that the ViT model is able to learn the image features and contents that best help it deduce the expected travel time range, with an average validation accuracy ranging between 80% and 85%. We assess the interpretability of the ViT model's predictions and showcase how this discrete travel time band prediction can subsequently improve continuous transit travel time estimation. The workflow and results presented in this study provide an end-to-end, scalable, automated, and highly efficient approach for integrating traditional transit data sources and roadside imagery to improve the estimation of transit travel duration. This work also demonstrates the value of incorporating real-time information from computer-vision sources, which are becoming increasingly accessible and can have major implications for improving operations and passenger real-time information.
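The percentile-based labeling step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the choice of four bands, and the sample AVL travel times are all assumptions made for the example.

```python
import numpy as np

def make_band_edges(segment_times, n_bands=4):
    """Derive travel-time band edges from observed AVL segment travel
    times using equally spaced percentiles (band count is an assumption)."""
    qs = np.linspace(0, 100, n_bands + 1)
    return np.percentile(segment_times, qs)

def label_image(travel_time, edges):
    """Assign a discrete travel-time band label to an image, based on the
    travel time observed across the monitored segment when it was captured."""
    # searchsorted finds the band; clip keeps the extremes inside [0, n_bands-1]
    band = np.searchsorted(edges, travel_time, side="right") - 1
    return int(np.clip(band, 0, len(edges) - 2))

# Hypothetical historical AVL travel times (seconds) for the segment
observed = [95, 110, 120, 130, 140, 155, 170, 190, 220, 260]
edges = make_band_edges(observed, n_bands=4)
print(label_image(100, edges))  # fastest band (0)
print(label_image(250, edges))  # slowest band (3)
```

Each image is thus paired with a class label rather than a raw duration, which turns the problem into the discrete band classification task the ViT model is trained on.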