Paper Title
Interpreting deep learning output for out-of-distribution detection
Paper Authors
Paper Abstract
Commonly used AI networks are highly confident in their predictions, even when the evidence for a certain decision is dubious. Investigating the output of a deep learning model is pivotal for understanding its decision processes and assessing its capabilities and limitations. By analyzing the distributions of raw network output vectors, it can be observed that each class has its own decision boundary and, thus, the same raw output value has different support for different classes. Inspired by this fact, we have developed a new method for out-of-distribution (OOD) detection. The method offers an explanatory step beyond simple thresholding of the softmax output, toward understanding and interpreting the model learning process and its output. Instead of assigning the class label of the highest logit to each new sample presented to the network, the method takes the distributions over all classes into consideration. A probability score interpreter (PSI) is created based on the joint logit values in relation to their respective correct-versus-wrong class distributions. The PSI suggests whether the sample is likely to belong to a specific class, whether the network is unsure, or whether the sample is likely an outlier or a type unknown to the network. The simple PSI has the benefit of being applicable to already-trained networks: the distributions of correct versus wrong class values for each output node are established by simply running the training examples through the trained network. We demonstrate our OOD detection method on a challenging transmission electron microscopy virus image dataset. We simulate a real-world application in which images of virus types unknown to a trained virus classifier, yet acquired with the same procedures and instruments, constitute the OOD samples.
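Since the abstract only outlines how the PSI is constructed, the following is a minimal illustrative sketch in Python/NumPy of the general idea: per output node, summarize the logit distributions for correct versus wrong classes from the training set, then interpret a new sample's logit vector against those distributions. The Gaussian summaries, the z-score margin, and all function names here are our assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def fit_logit_distributions(logits, labels, num_classes):
    """For each output node k, summarize the logit values seen when the
    true class is k (correct) and when it is not (wrong).
    NOTE: Gaussian mean/std summaries are an assumption; the paper's
    exact distribution model is not specified in the abstract."""
    stats = {}
    for k in range(num_classes):
        correct = logits[labels == k, k]
        wrong = logits[labels != k, k]
        stats[k] = {
            "correct": (correct.mean(), correct.std() + 1e-8),
            "wrong": (wrong.mean(), wrong.std() + 1e-8),
        }
    return stats

def _z(value, mean_std):
    mean, std = mean_std
    return (value - mean) / std

def psi_interpret(logit_vec, stats, margin=2.0):
    """Toy PSI-style interpreter: a class counts as 'supported' if the
    sample's logit lies near that class's correct distribution and well
    above its wrong distribution. The margin of 2.0 is a hypothetical
    threshold chosen for illustration."""
    supported = [
        k for k, s in stats.items()
        if abs(_z(logit_vec[k], s["correct"])) < margin
        and _z(logit_vec[k], s["wrong"]) > margin
    ]
    if len(supported) == 1:
        return f"likely class {supported[0]}"
    if len(supported) > 1:
        return "unsure (multiple classes supported)"
    return "likely outlier / unknown type (OOD)"

# Example usage with random placeholder data:
rng = np.random.default_rng(0)
train_logits = rng.normal(size=(1000, 5))
train_labels = rng.integers(0, 5, size=1000)
stats = fit_logit_distributions(train_logits, train_labels, num_classes=5)
print(psi_interpret(train_logits[0], stats))
```

The key property this sketch preserves is that the interpreter considers every output node's logit against class-specific distributions, rather than thresholding a single softmax maximum, so it can return an "unsure" or "OOD" verdict instead of forcing a class label.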