Paper Title
Cross-Resolution Flow Propagation for Foveated Video Super-Resolution
Paper Authors
Paper Abstract
The demand for high-resolution video content has grown over the years. However, the delivery of high-resolution video is constrained either by the computational resources required for rendering or by the network bandwidth required for remote transmission. To remedy this limitation, we leverage the eye trackers found alongside existing augmented and virtual reality headsets. We propose applying video super-resolution (VSR) techniques to fuse low-resolution context with regional high-resolution context, enabling resource-constrained consumption of high-resolution content without a perceivable drop in quality. Eye trackers provide the gaze direction of a user, aiding the extraction of the regional high-resolution context. Since only pixels that fall within the gaze region can be resolved by the human eye, a large amount of the delivered content is redundant: we cannot perceive quality differences in regions beyond the observed region. To generate a visually pleasing frame from the fusion of the high-resolution and low-resolution regions, we study the capability of a deep neural network to transfer the context of the observed region to the other (low-resolution) regions of the current and future frames. We label this task Foveated Video Super-Resolution (FVSR), as we need to super-resolve the low-resolution regions of current and future frames through the fusion of pixels from the gaze region. We propose Cross-Resolution Flow Propagation (CRFP) for FVSR. We train and evaluate CRFP on the REDS dataset on the task of 8x FVSR, i.e., a combination of 8x VSR and the fusion of the foveated region. Departing from the conventional evaluation of per-frame quality using SSIM or PSNR, we propose an evaluation of past foveated regions, measuring the capability of a model to leverage the noise present in eye trackers during FVSR. Code is made available at https://github.com/eugenelet/CRFP.
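To make the task setup concrete, below is a minimal sketch, not the authors' implementation, of how one FVSR input frame could be assembled: the low-resolution frame is naively upsampled 8x and the gaze-centered high-resolution patch is pasted over it, along with a binary mask marking where true high-resolution context is available. The function name `compose_fvsr_input`, the `fovea_size` of 96 pixels, and the use of such a mask are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of FVSR input composition (illustrative; not the CRFP code).
import torch
import torch.nn.functional as F

def compose_fvsr_input(lr_frame, hr_frame, gaze_xy, scale=8, fovea_size=96):
    """lr_frame: (1, 3, H, W) low-resolution frame.
    hr_frame: (1, 3, H*scale, W*scale) high-resolution frame, standing in for
    the streamed foveated patch. gaze_xy: (x, y) gaze position in
    high-resolution pixel coordinates (assumed to come from an eye tracker)."""
    # Naive 8x upsampling of the low-resolution context.
    up = F.interpolate(lr_frame, scale_factor=scale, mode='bicubic',
                       align_corners=False)
    _, _, H, W = up.shape
    x, y = gaze_xy
    half = fovea_size // 2
    # Clamp the foveated window inside the frame; real eye-tracker readings
    # are noisy, so this window jitters from frame to frame.
    x0 = max(0, min(W - fovea_size, x - half))
    y0 = max(0, min(H - fovea_size, y - half))
    # Replace the gaze region with true high-resolution pixels.
    up[:, :, y0:y0 + fovea_size, x0:x0 + fovea_size] = \
        hr_frame[:, :, y0:y0 + fovea_size, x0:x0 + fovea_size]
    # Binary mask marking where high-resolution context is available; a model
    # like CRFP must propagate this detail to the rest of the frame over time.
    mask = torch.zeros(1, 1, H, W)
    mask[:, :, y0:y0 + fovea_size, x0:x0 + fovea_size] = 1.0
    return up, mask

# Usage: 8x FVSR on a 90x160 LR frame (720x1280 HR), gaze near frame center.
lr = torch.rand(1, 3, 90, 160)
hr = torch.rand(1, 3, 720, 1280)
frame, mask = compose_fvsr_input(lr, hr, gaze_xy=(640, 360))
print(frame.shape, int(mask.sum()))  # torch.Size([1, 3, 720, 1280]) 9216
```

Under this setup, the super-resolution model sees a frame that is sharp only inside the jittering gaze window; the learning problem the abstract describes is to carry that sharp context forward to the low-resolution regions of the current and future frames.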