Model-Based Image Signal Processors via Learnable Dictionaries

Paper title

Model-Based Image Signal Processors via Learnable Dictionaries

Authors

Conde, Marcos V.; McDonagh, Steven; Maggioni, Matteo; Leonardis, Aleš; Pérez-Pellitero, Eduardo

Abstract

Digital cameras transform sensor RAW readings into RGB images by means of their Image Signal Processor (ISP). Computational photography tasks such as image denoising and colour constancy are commonly performed in the RAW domain, in part due to the inherent hardware design, but also due to the appealing simplicity of noise statistics that result from the direct sensor readings. Despite this, the availability of RAW images is limited in comparison with the abundance and diversity of available RGB data. Recent approaches have attempted to bridge this gap by estimating the RGB to RAW mapping: handcrafted model-based methods that are interpretable and controllable usually require manual parameter fine-tuning, while end-to-end learnable neural networks require large amounts of training data, at times with complex training procedures, and generally lack interpretability and parametric control. Towards addressing these existing limitations, we present a novel hybrid model-based and data-driven ISP that builds on canonical ISP operations and is both learnable and interpretable. Our proposed invertible model, capable of bidirectional mapping between RAW and RGB domains, employs end-to-end learning of rich parameter representations, i.e. dictionaries, that are free from direct parametric supervision and additionally enable simple and plausible data augmentation. We evidence the value of our data generation process by extensive experiments under both RAW image reconstruction and RAW image denoising tasks, obtaining state-of-the-art performance in both. Additionally, we show that our ISP can learn meaningful mappings from few data samples, and that denoising models trained with our dictionary-based data augmentation are competitive despite having only few or zero ground-truth labels.

Paper title