Paper title
Multimodal semantic forecasting based on conditional generation of future features
Authors
Abstract
This paper considers semantic forecasting in road-driving scenes. Most existing approaches address this problem as deterministic regression of future features or future predictions given the observed frames. However, such approaches ignore the fact that the future cannot always be guessed with certainty. For example, when a car is about to turn a corner, the road that is currently occluded by buildings may turn out to be either free to drive on or occupied by people, other vehicles, or roadworks. When a deterministic model confronts such a situation, its best guess is to forecast the most likely outcome. However, this is not acceptable, since it defeats the purpose of forecasting, which is to improve safety. It also throws away valuable training data, since a deterministic model is unable to learn any deviation from the norm. We address this problem by allowing the model to forecast different futures. We propose to formulate multimodal forecasting as sampling from a multimodal generative model conditioned on the observed frames. Experiments on the Cityscapes dataset reveal that our multimodal model outperforms its deterministic counterpart in short-term forecasting while performing slightly worse in the mid-term case.
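The contrast the abstract draws between a single most-likely forecast and sampled forecasts can be illustrated with a toy sketch. This is not the paper's model: the outcome labels, the `FUTURES` list, and both function names are hypothetical, and drawing from an empirical list merely stands in for sampling a conditional generative model.

```python
import random
from collections import Counter

# Hypothetical toy data: futures observed after the same context
# (a car approaching a corner whose road is occluded by buildings).
FUTURES = ["free"] * 7 + ["pedestrian", "vehicle", "roadworks"]


def deterministic_forecast(futures):
    """Return the single most frequent outcome, ignoring all rarer ones."""
    return Counter(futures).most_common(1)[0][0]


def multimodal_forecast(futures, num_samples, rng):
    """Draw several futures from the empirical conditional distribution.

    Assumption: this stands in for sampling a conditional generative
    model with different latent codes; it is not the paper's architecture.
    """
    return [rng.choice(futures) for _ in range(num_samples)]


rng = random.Random(0)
single = deterministic_forecast(FUTURES)
samples = multimodal_forecast(FUTURES, num_samples=200, rng=rng)

print("deterministic forecast:", single)
print("distinct sampled futures:", sorted(set(samples)))
```

In this sketch the deterministic forecast always reports the dominant outcome, so the rarer occupied-road cases never appear in its output, whereas repeated sampling can surface them.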