Paper Title
Training Interpretable Convolutional Neural Networks by Differentiating Class-specific Filters
Paper Authors
Paper Abstract
Convolutional neural networks (CNNs) have been successfully used in a range of tasks. However, CNNs are often viewed as "black boxes" and lack interpretability. One main reason is filter-class entanglement -- an intricate many-to-many correspondence between filters and classes. Most existing works attempt post-hoc interpretation on a pre-trained model, while neglecting to reduce the entanglement underlying the model. In contrast, we focus on alleviating filter-class entanglement during training. Inspired by cellular differentiation, we propose a novel strategy to train interpretable CNNs by encouraging class-specific filters, where each filter responds to only one (or a few) classes. Concretely, we design a learnable sparse Class-Specific Gate (CSG) structure that flexibly assigns each filter to one (or a few) classes. The gate allows a filter's activation to pass only when the input sample comes from the assigned class. Extensive experiments demonstrate the strong performance of our method in generating a sparse and highly class-related representation of the input, which leads to stronger interpretability. Moreover, compared with the standard training strategy, our model shows benefits in applications such as object localization and adversarial sample detection. Code link: https://github.com/hyliang96/CSGCNN.
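
The abstract describes the CSG as a learnable gate that passes a filter's activation only for samples of the filter's assigned class(es), with sparsity encouraging a one-filter-to-few-classes assignment. Below is a minimal PyTorch sketch of that idea under stated assumptions; it is not the authors' released implementation (see the code link above), and the module name `ClassSpecificGate`, the sigmoid parameterization, and the L1-style sparsity penalty are illustrative choices.

```python
# Hypothetical sketch of a class-specific gating layer, assuming the gate is a
# learnable (num_filters x num_classes) matrix in (0, 1) applied to the output
# of a convolutional layer during training, plus a sparsity penalty.
import torch
import torch.nn as nn


class ClassSpecificGate(nn.Module):
    def __init__(self, num_filters: int, num_classes: int):
        super().__init__()
        # Unconstrained parameters; a sigmoid keeps gate values in (0, 1).
        self.logits = nn.Parameter(torch.zeros(num_filters, num_classes))

    def gates(self) -> torch.Tensor:
        # G[k, c] ~ "filter k is assigned to class c".
        return torch.sigmoid(self.logits)

    def forward(self, features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # features: (batch, num_filters, H, W); labels: (batch,)
        g = self.gates()[:, labels]            # (num_filters, batch)
        g = g.t().unsqueeze(-1).unsqueeze(-1)  # (batch, num_filters, 1, 1)
        # Suppress filters not assigned to each sample's class.
        return features * g

    def sparsity_penalty(self) -> torch.Tensor:
        # L1-style penalty pushing each filter toward one (or a few) classes.
        return self.gates().sum()


if __name__ == "__main__":
    # Usage sketch: gate the last conv layer's features during training and
    # add the sparsity penalty to the task loss.
    csg = ClassSpecificGate(num_filters=64, num_classes=10)
    feats = torch.randn(8, 64, 7, 7)
    labels = torch.randint(0, 10, (8,))
    gated = csg(feats, labels)
    extra_loss = 1e-4 * csg.sparsity_penalty()
    print(gated.shape, extra_loss.item())
```

At test time such a gate would not be applied (labels are unknown); the point of gated training is that filters themselves become class-specific, yielding the sparse, class-related representations the abstract refers to.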