Paper Title

Why Layer-Wise Learning is Hard to Scale-up and a Possible Solution via Accelerated Downsampling

Paper Authors

Wenchi Ma, Miao Yu, Kaidong Li, Guanghui Wang

Paper Abstract

Layer-wise learning, as an alternative to global back-propagation, is easy to interpret and analyze, and is memory efficient. Recent studies demonstrate that layer-wise learning can achieve state-of-the-art performance in image classification on various datasets. However, previous studies of layer-wise learning are limited to networks with simple hierarchical structures, and the performance degrades severely for deeper networks such as ResNet. This paper, for the first time, reveals that the fundamental reason impeding the scale-up of layer-wise learning is the relatively poor separability of the feature space in shallow layers. This argument is empirically verified by controlling the intensity of the convolution operation in local layers. We discover that the poorly separable features from shallow layers are mismatched with the strong supervision constraint applied throughout the entire network, making layer-wise learning sensitive to network depth. The paper further proposes a downsampling acceleration approach that weakens the learning of the shallow layers so as to transfer the learning emphasis to the deep feature space, where the separability better matches the supervision constraint. Extensive experiments have been conducted to verify the new finding and demonstrate the advantages of the proposed downsampling acceleration in improving the performance of layer-wise learning.
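
To make the idea concrete, below is a minimal, hypothetical PyTorch sketch of layer-wise training with accelerated downsampling: each block is optimized against its own auxiliary classifier with gradients cut between blocks (no global back-propagation), and the shallow blocks downsample with stride 2 so that most of the learning effort falls on the deeper, more separable feature space. The block widths, strides, auxiliary-head design, and the 10-class CIFAR-sized input are illustrative assumptions, not the authors' exact architecture.

```python
# A minimal sketch (assumptions noted above), not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalBlock(nn.Module):
    """One locally trained stage: conv -> BN -> ReLU, with an optional downsampling stride."""
    def __init__(self, in_ch, out_ch, stride):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )
        # Auxiliary classifier that supplies the local supervision signal (10 classes assumed).
        self.aux_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(out_ch, 10))

    def forward(self, x):
        feat = self.body(x)
        return feat, self.aux_head(feat)

# Accelerated downsampling: shallow blocks use stride 2, so spatial resolution shrinks
# early and the later blocks operate on a smaller, deeper feature space.
blocks = nn.ModuleList([
    LocalBlock(3,   64,  stride=2),   # shallow block, aggressive downsampling
    LocalBlock(64,  128, stride=2),   # shallow block, aggressive downsampling
    LocalBlock(128, 256, stride=1),
    LocalBlock(256, 512, stride=1),
])
optimizers = [torch.optim.SGD(b.parameters(), lr=0.1, momentum=0.9) for b in blocks]

def layerwise_step(images, labels):
    """One layer-wise update: each block minimizes its own auxiliary loss, and its
    output is detached before feeding the next block (no global back-propagation)."""
    x = images
    for block, opt in zip(blocks, optimizers):
        feat, logits = block(x)
        loss = F.cross_entropy(logits, labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
        x = feat.detach()  # cut the gradient path between local blocks

# Example usage with random CIFAR-10-sized data.
layerwise_step(torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,)))
```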
