Paper Title
Multiple Instance-Based Video Anomaly Detection using Deep Temporal Encoding-Decoding
Paper Authors
Paper Abstract
In this paper, we propose a weakly supervised deep temporal encoding-decoding solution for anomaly detection in surveillance videos using multiple instance learning. The proposed approach uses both abnormal and normal video clips during the training phase, which is developed in a multiple instance framework where we treat a video as a bag and its clips as instances in the bag. Our main contribution lies in a novel approach to modeling the temporal relations between video instances: we treat video instances (clips) as sequential visual data rather than as independent instances. We employ a deep temporal encoder-decoder network designed to capture the spatio-temporal evolution of video instances over time. We also propose a new loss function that is smoother than similar loss functions recently presented in the computer vision literature, and therefore enjoys faster convergence and improved tolerance to local minima during training. The proposed temporal encoding-decoding approach with the modified loss is benchmarked against the state of the art in simulation studies. The results show that the proposed method performs similarly to or better than state-of-the-art solutions for anomaly detection in video surveillance applications.
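To make the multiple-instance setup concrete, the following is a minimal sketch of a hinge-based MIL ranking loss of the kind the abstract builds on, with the temporal-smoothness and sparsity terms commonly added in this line of work. The function name, hyperparameter values, and the specific form of each term are illustrative assumptions, not the paper's actual (smoother) loss, whose exact form is not given in the abstract.

```python
import numpy as np

def mil_ranking_loss(abnormal_scores, normal_scores,
                     lambda_smooth=8e-5, lambda_sparse=8e-5):
    """Hypothetical hinge-based MIL ranking loss over two bags of clip scores.

    abnormal_scores / normal_scores: 1-D arrays of per-clip anomaly scores
    for one abnormal bag and one normal bag. Hyperparameter values are
    illustrative, not taken from the paper.
    """
    a = np.asarray(abnormal_scores, dtype=float)
    n = np.asarray(normal_scores, dtype=float)
    # Ranking term: the highest-scoring clip in the abnormal bag should
    # outrank the highest-scoring clip in the normal bag by a margin of 1.
    ranking = max(0.0, 1.0 - a.max() + n.max())
    # Temporal smoothness: scores of adjacent clips should vary gradually,
    # reflecting that clips form a sequence rather than independent instances.
    smoothness = np.sum(np.diff(a) ** 2)
    # Sparsity: anomalies are rare, so abnormal-bag scores should stay sparse.
    sparsity = np.sum(a)
    return ranking + lambda_smooth * smoothness + lambda_sparse * sparsity
```

The hinge in the ranking term is non-smooth at the margin boundary; a smoother surrogate there (e.g. a soft-plus-style relaxation) is the kind of modification the abstract's "smoother loss" refers to.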