Paper Title
Visualizing the Diversity of Representations Learned by Bayesian Neural Networks
Paper Authors
Paper Abstract
Explainable Artificial Intelligence (XAI) aims to make learning machines less opaque, and offers researchers and practitioners various tools to reveal the decision-making strategies of neural networks. In this work, we investigate how XAI methods can be used for exploring and visualizing the diversity of feature representations learned by Bayesian Neural Networks (BNNs). Our goal is to provide a global understanding of BNNs by making their decision-making strategies a) visible and tangible through feature visualizations and b) quantitatively measurable with a distance measure learned by contrastive learning. Our work provides new insights into the \emph{posterior} distribution in terms of human-understandable feature information with regard to the underlying decision-making strategies. The main findings of our work are the following: 1) global XAI methods can be applied to explain the diversity of decision-making strategies of BNN instances, 2) Monte Carlo dropout with commonly used dropout rates exhibits increased diversity in feature representations compared to the multimodal posterior approximation of MultiSWAG, 3) the diversity of learned feature representations correlates highly with the uncertainty estimate for the output, and 4) the inter-mode diversity of the multimodal posterior decreases as the network width increases, while the intra-mode diversity increases. These findings are consistent with recent deep neural network theory, providing additional intuition about what the theory implies in terms of human-understandable concepts.