Paper Title

Analysis of Different Losses for Deep Learning Image Colorization

Authors

Coloma Ballester, Aurélie Bugeau, Hernan Carrillo, Michaël Clément, Rémi Giraud, Lara Raad, Patricia Vitoria

Abstract

Image colorization aims to add color information to a grayscale image in a realistic way. Recent methods mostly rely on deep learning strategies. When learning to automatically colorize an image, one can define well-suited objective functions related to the desired color output. Some of them are based on a specific type of error between the predicted image and the ground truth, while other losses rely on the comparison of perceptual properties. But is the choice of the objective function really that crucial, i.e., does it play an important role in the results? In this chapter, we aim to answer this question by analyzing the impact of the loss function on the estimated colorization results. To that end, we review the different losses and evaluation metrics that are used in the literature. We then train a baseline network with several of the reviewed objective functions: classic L1 and L2 losses, as well as more complex combinations such as Wasserstein GAN and VGG-based LPIPS loss. Quantitative results show that the models trained with VGG-based LPIPS provide overall slightly better results for most evaluation metrics. Qualitative results exhibit more vivid colors when training with Wasserstein GAN plus the L2 loss, or again with the VGG-based LPIPS. Finally, the convenience of quantitative user studies is also discussed to overcome the difficulty of properly assessing colorized images, notably for the case of old archive photographs where no ground truth is available.
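To make the "classic L1 and L2 losses" mentioned in the abstract concrete, the following is a minimal sketch of the two error measures applied to a handful of color values. The function names, the sample values, and the choice of averaging over all values are assumptions made for illustration; the abstract does not state the color space or normalization used in the chapter.

```python
# Illustrative sketch only: names and sample values are hypothetical,
# and the mean reduction is an assumption, not taken from the chapter.

def l1_loss(pred, target):
    """Mean absolute error between two equal-length sequences of values."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def l2_loss(pred, target):
    """Mean squared error between two equal-length sequences of values."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


# Hypothetical chrominance values for four pixels, scaled to [0, 1].
predicted = [0.5, 0.25, 1.0, 0.0]
ground_truth = [0.5, 0.75, 0.0, 0.0]

print(l1_loss(predicted, ground_truth))  # 0.375
print(l2_loss(predicted, ground_truth))  # 0.3125
```

Because L2 squares each difference, the single large error on the third pixel accounts for most of the L2 value (1.0 of the 1.25 total before averaging), whereas under L1 it contributes in proportion to its size.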
