Paper Title

Deep Variational Luenberger-type Observer for Stochastic Video Prediction

Authors

Dong Wang, Feng Zhou, Zheng Yan, Guang Yao, Zongxuan Liu, Wennan Ma, Cewu Lu

Abstract


Considering the inherent stochasticity and uncertainty, predicting future video frames is exceptionally challenging. In this work, we study the problem of video prediction by combining the interpretability of stochastic state space models with the representation learning of deep neural networks. Our model builds upon a variational encoder, which transforms the input video into a latent feature space, and a Luenberger-type observer, which captures the dynamic evolution of the latent features. This enables the decomposition of videos into static features and dynamics in an unsupervised manner. By deriving a stability theory for the nonlinear Luenberger-type observer, the hidden states in the feature space become insensitive to their initial values, which improves the robustness of the overall model. Furthermore, a variational lower bound on the data log-likelihood is derived to obtain a tractable posterior prediction distribution based on the variational principle. Finally, experiments on the Bouncing Balls and Pendulum datasets demonstrate that the proposed model outperforms concurrent works.
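The abstract's key claim is that a Luenberger-type observer makes the latent state insensitive to its initial value. The sketch below is not the paper's deep model; it illustrates only the classical linear Luenberger observer the title refers to, with hand-picked matrices `A`, `C`, and gain `L` (all assumptions chosen so that `A - L C` has spectral radius below 1). The correction term `L (y - C x_hat)` drives the estimation error to zero regardless of the observer's initialization, which is the stability property the paper extends to the nonlinear, latent-space setting.

```python
import numpy as np

# Classical Luenberger observer:
#   x_hat[k+1] = A x_hat[k] + L (y[k] - C x_hat[k])
# The estimation error e = x - x_hat then evolves as e[k+1] = (A - L C) e[k],
# so e -> 0 whenever all eigenvalues of (A - L C) lie inside the unit circle.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])      # illustrative dynamics (discrete double integrator)
C = np.array([[1.0, 0.0]])      # only the first state component is measured
L = np.array([[0.5],
              [1.0]])           # gain chosen so A - L C is Schur stable

x = np.array([1.0, -0.5])       # true state (unknown to the observer)
x_hat = np.zeros(2)             # observer deliberately starts from the wrong value

for _ in range(100):
    y = C @ x                               # measurement from the true system
    x_hat = A @ x_hat + L @ (y - C @ x_hat) # predict + correct
    x = A @ x                               # true (autonomous) dynamics

print(np.allclose(x, x_hat, atol=1e-3))     # prints True: the estimate converged
```

Despite starting from a different initial value, `x_hat` tracks `x` after enough steps, which is exactly the initial-condition insensitivity the abstract invokes for robustness.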
