Paper title
Vision at A Glance: Interplay between Fine and Coarse Information Processing Pathways
Paper authors
Paper abstract
Object recognition is often viewed as a feedforward, bottom-up process in machine learning, but in real neural systems it is a complicated process that involves the interplay between two signal pathways. One is the parvocellular pathway (P-pathway), which is slow and extracts fine features of objects; the other is the magnocellular pathway (M-pathway), which is fast and extracts coarse features of objects. It has been suggested that the interplay between the two pathways endows the neural system with the capacity to process visual information rapidly, adaptively, and robustly. However, the underlying computational mechanisms remain largely unknown. In this study, we build a computational model to elucidate the computational advantages associated with the interactions between the two pathways. Our model consists of two convolutional neural networks: one mimics the P-pathway, referred to as FineNet, which is deep, has small-size kernels, and receives detailed visual inputs; the other mimics the M-pathway, referred to as CoarseNet, which is shallow, has large-size kernels, and receives low-pass filtered or binarized visual inputs. The two pathways interact with each other via a Restricted Boltzmann Machine. We find that: 1) FineNet can teach CoarseNet through imitation and improve its performance considerably; 2) CoarseNet can improve the noise robustness of FineNet through association; 3) the output of CoarseNet can serve as a cognitive bias to improve the performance of FineNet. We hope that this study will provide insight into understanding visual information processing and inspire the development of new object recognition architectures.
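To make two of the ingredients above concrete, the following is a minimal, dependency-free sketch of (a) how a coarse input for a CoarseNet-like network could be prepared and (b) how "teaching through imitation" could be scored. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the low-pass filter, the binarization threshold, or the imitation objective. Here a mean (box) filter stands in for the low-pass filter, a fixed threshold of 0.5 is assumed, and imitation is assumed to be a KL divergence between temperature-softened outputs, as in standard knowledge distillation. All function names and parameters are hypothetical.

```python
import math


def box_blur(image, radius=1):
    """Low-pass filter a 2D grayscale image (a list of rows) with a mean filter.

    Assumption: a box filter stands in for the unspecified low-pass filter.
    Border pixels average only over neighbours that fall inside the image.
    """
    height, width = len(image), len(image[0])
    blurred = []
    for r in range(height):
        row = []
        for c in range(width):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(height, r + radius + 1))
                for cc in range(max(0, c - radius), min(width, c + radius + 1))
            ]
            row.append(sum(window) / len(window))
        blurred.append(row)
    return blurred


def binarize(image, threshold=0.5):
    """Map each pixel to 1.0 if it reaches the (assumed) threshold, else 0.0."""
    return [[1.0 if px >= threshold else 0.0 for px in row] for row in image]


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def imitation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened class probabilities.

    Assumption: the teacher plays the role of FineNet and the student the
    role of CoarseNet; the paper's actual imitation objective may differ.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Example usage with made-up values.
fine_input = [[0.0, 1.0], [1.0, 0.0]]
coarse_input = binarize(box_blur(fine_input, radius=1))
loss = imitation_loss([2.0, 0.5, -1.0], [1.0, 1.0, 0.0])
```

In this sketch the student would be trained to drive `imitation_loss` toward zero on the coarse inputs while the teacher sees the detailed inputs. The loss is zero when both networks produce identical logits and grows as their softened predictions diverge.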