Title
Efficient and Model-Based Infrared and Visible Image Fusion Via Algorithm Unrolling
Authors
Abstract
Infrared and visible image fusion (IVIF) aims to produce images that retain the thermal radiation information of infrared images and the texture details of visible images. In this paper, a model-driven convolutional neural network (CNN), referred to as Algorithm Unrolling Image Fusion (AUIF), is proposed to overcome the shortcomings of traditional CNN-based IVIF models. The proposed AUIF model starts from the iterative formulas of two traditional optimization models, which are established to accomplish two-scale decomposition, i.e., separating low-frequency base information and high-frequency detail information from source images. Algorithm unrolling is then applied: each iteration is mapped to a CNN layer, and each optimization model is transformed into a trainable neural network. Compared with general network architectures, the proposed framework incorporates model-based prior information and is therefore more principled in its design. After the unrolling operation, our model contains two decomposers (encoders) and an additional reconstructor (decoder). In the training phase, this network is trained to reconstruct the input image. In the test phase, the base (and detail) feature maps decomposed from the infrared and visible images are merged by an extra fusion layer, and the decoder then outputs the fused image. Qualitative and quantitative comparisons demonstrate the superiority of our model, which can robustly generate fusion images containing highlighted targets and legible details, exceeding the state-of-the-art methods. Furthermore, our network has fewer weights and runs faster.
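The core idea of algorithm unrolling described above can be illustrated with a toy example. The sketch below is not the AUIF network itself: it assumes a simple quadratic smoothing model (data fidelity plus a gradient-smoothness penalty) as the stand-in optimization problem, and runs a fixed number of gradient-descent iterations to split a source image into a low-frequency base layer and a high-frequency detail layer. In the actual unrolled network, each such iteration would become a CNN layer with learnable weights; here the step size `eta` and penalty weight `lam` are hypothetical fixed constants chosen for stability.

```python
import numpy as np

def lowpass_step(base, src, eta=0.1, lam=1.0):
    # One gradient-descent iteration on a toy smoothing model:
    #   minimize 0.5*||base - src||^2 + 0.5*lam*||grad(base)||^2
    # Discrete Laplacian of `base` with periodic boundaries (np.roll wraps).
    lap = (np.roll(base, 1, axis=0) + np.roll(base, -1, axis=0)
           + np.roll(base, 1, axis=1) + np.roll(base, -1, axis=1)
           - 4.0 * base)
    grad = (base - src) - lam * lap
    return base - eta * grad

def unrolled_decompose(src, n_layers=8):
    # "Unrolling": truncate the iteration to a fixed depth. In AUIF each of
    # these steps is replaced by a trainable CNN layer; here they are fixed.
    base = src.copy()
    for _ in range(n_layers):
        base = lowpass_step(base, src)
    detail = src - base  # two-scale split: source = base + detail
    return base, detail

img = np.random.rand(16, 16)
base, detail = unrolled_decompose(img)
# Exact reconstruction holds by construction of the split.
assert np.allclose(base + detail, img)
```

At test time, AUIF merges the base maps of the infrared and visible inputs (and likewise the detail maps) with a fusion layer before decoding; in this toy setting that would correspond to combining the two `base` arrays and the two `detail` arrays before summing them back into an image.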