Paper Title
Scalable Partial Explainability in Neural Networks via Flexible Activation Functions
Paper Authors
Paper Abstract
Achieving transparency in black-box deep learning algorithms is still an open challenge. The high-dimensional features and decisions produced by deep neural networks (NNs) require new algorithms and methods to expose their mechanisms. Current state-of-the-art NN interpretation methods (e.g., saliency maps, DeepLIFT, LIME) focus more on the direct relationship between NN outputs and inputs than on the NN structure and operations themselves. In current deep NN operations, there is uncertainty over the exact role played by neurons with fixed activation functions. In this paper, we achieve a partially explainable learning model by symbolically explaining the role of activation functions (AFs) under a scalable topology. This is carried out by modeling the AFs as adaptive Gaussian processes (GPs) that sit within a novel scalable NN topology based on the Kolmogorov-Arnold Superposition Theorem (KST). In this scalable NN architecture, the AFs are generated by GP interpolation between control points and can thus be tuned during back-propagation via gradient descent. The control points act as the core enabler of both local and global adjustability of the AFs, while the GP interpolation constrains the intrinsic autocorrelation to avoid over-fitting. We show that there exists a trade-off between the NN's expressive power and interpretation complexity under linear KST topology scaling. To demonstrate this, we perform a case study on a binary classification dataset for banknote authentication. By quantitatively and qualitatively investigating the mapping relationship between the inputs and the output, our explainable model can provide an interpretation of each one-dimensional attribute. These early results suggest that our model has the potential to act as the final interpretation layer for deep neural networks.
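To make the core mechanism described in the abstract concrete, below is a minimal sketch (not the authors' code) of an activation function whose shape is given by noiseless GP (RBF-kernel) interpolation between a small set of control points, with the control-point values exposed as trainable parameters so they can be adjusted by gradient descent during back-propagation. The class name `GPActivation` and the hyperparameters (`n_control`, `length_scale`, the control-point grid) are illustrative assumptions, not taken from the paper.

```python
# Sketch only: GP-interpolated activation with learnable control points.
import torch
import torch.nn as nn


class GPActivation(nn.Module):
    """Element-wise activation whose shape is the GP posterior mean
    interpolating learnable control-point values on a fixed 1-D grid."""

    def __init__(self, n_control: int = 10, x_min: float = -3.0,
                 x_max: float = 3.0, length_scale: float = 0.5):
        super().__init__()
        # Fixed control-point locations; their values are trainable.
        self.register_buffer("x_c", torch.linspace(x_min, x_max, n_control))
        self.y_c = nn.Parameter(torch.tanh(self.x_c.clone()))  # init near tanh
        self.length_scale = length_scale

    def _rbf(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        # Squared-exponential kernel k(a, b) = exp(-(a - b)^2 / (2 l^2)).
        d2 = (a.unsqueeze(-1) - b.unsqueeze(-2)) ** 2
        return torch.exp(-0.5 * d2 / self.length_scale ** 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # GP interpolation: f(x) = k(x, X_c) K(X_c, X_c)^{-1} y_c.
        K_cc = self._rbf(self.x_c, self.x_c) + 1e-6 * torch.eye(len(self.x_c))
        k_xc = self._rbf(x.reshape(-1), self.x_c)        # (N, n_control)
        weights = torch.linalg.solve(K_cc, self.y_c)      # (n_control,)
        return (k_xc @ weights).reshape(x.shape)


# Usage: the control-point values receive gradients like any other NN weight.
act = GPActivation()
x = torch.randn(8, 4)
loss = act(x).pow(2).mean()
loss.backward()
print(act.y_c.grad.shape)  # torch.Size([10])
```

Because the kernel length scale couples neighbouring control points, tuning one control value perturbs the AF only smoothly and locally, which is one plausible reading of the abstract's claim that GP interpolation constrains the intrinsic autocorrelation to avoid over-fitting.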