Paper Title
SelfD: Self-Learning Large-Scale Driving Policies From the Web
Paper Authors
论文摘要
Effectively utilizing the vast amounts of ego-centric navigation data that is freely available on the internet can advance generalized intelligent systems, i.e., to robustly scale across perspectives, platforms, environmental conditions, scenarios, and geographical locations. However, it is difficult to directly leverage such large amounts of unlabeled and highly diverse data for complex 3D reasoning and planning tasks. Consequently, researchers have primarily focused on its use for various auxiliary pixel- and image-level computer vision tasks that do not consider an ultimate navigational objective. In this work, we introduce SelfD, a framework for learning scalable driving by utilizing large amounts of online monocular images. Our key idea is to leverage iterative semi-supervised training when learning imitative agents from unlabeled data. To handle unconstrained viewpoints, scenes, and camera parameters, we train an image-based model that directly learns to plan in the Bird's Eye View (BEV) space. Next, we use unlabeled data to augment the decision-making knowledge and robustness of an initially trained model via self-training. In particular, we propose a pseudo-labeling step which enables making full use of highly diverse demonstration data through "hypothetical" planning-based data augmentation. We employ a large dataset of publicly available YouTube videos to train SelfD and comprehensively analyze its generalization benefits across challenging navigation scenarios. Without requiring any additional data collection or annotation efforts, SelfD demonstrates consistent improvements (by up to 24%) in driving performance evaluation on nuScenes, Argoverse, Waymo, and CARLA.
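To make the training recipe described in the abstract concrete, below is a minimal, hypothetical Python sketch of the iterative self-training loop with pseudo-labeling and "hypothetical" planning-based augmentation. The names (planner, fit, predict_bev_plan, the command set, and the loaders) are illustrative assumptions, not the paper's actual API or implementation.

```python
# Illustrative sketch only (not the authors' code). All class/method names
# are hypothetical placeholders. The loop mirrors the abstract's recipe:
# (1) train a BEV planner on labeled demonstrations,
# (2) pseudo-label large amounts of unlabeled web video with predicted plans,
#     including plans for alternative "hypothetical" high-level commands,
# (3) retrain on the union of labeled and pseudo-labeled data, and iterate.

def self_train(planner, labeled_loader, web_video_loader, rounds=2):
    # Step 1: supervised pre-training on labeled driving demonstrations.
    planner.fit(labeled_loader)

    for _ in range(rounds):
        pseudo_labeled = []
        for frames in web_video_loader:
            # Step 2: the current model predicts BEV waypoint plans for the
            # unlabeled monocular frames under several hypothetical commands,
            # so each frame yields multiple conditional pseudo-labels.
            for command in ("left", "straight", "right"):
                plan = planner.predict_bev_plan(frames, command)
                pseudo_labeled.append((frames, command, plan))

        # Step 3: retrain on labeled data plus the pseudo-labeled web data.
        planner.fit(labeled_loader, extra=pseudo_labeled)

    return planner
```

This sketch only conveys the structure of the semi-supervised procedure; details such as filtering low-confidence pseudo-labels or handling camera parameters are omitted.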