Paper Title
Model2Detector: Widening the Information Bottleneck for Out-of-Distribution Detection using a Handful of Gradient Steps
Paper Authors
Abstract
Out-of-distribution detection is an important capability that has long eluded vanilla neural networks. Deep neural networks (DNNs) tend to generate over-confident predictions when presented with inputs that are significantly out-of-distribution (OOD). This can be dangerous when machine learning systems are deployed in the wild, because attacks then become difficult to detect. Recent advances in inference-time out-of-distribution detection help mitigate some of these problems. However, existing methods can be restrictive, as they are often computationally expensive. Additionally, these methods require training a downstream detector model that learns to distinguish OOD inputs from in-distribution ones, which adds latency during inference. Here, we offer an information-theoretic perspective on why neural networks are inherently incapable of OOD detection. We attempt to mitigate these flaws by converting a trained model into an OOD detector using a handful of steps of gradient descent. Our work can be employed as a post-processing method whereby an inference-time ML system can convert a trained model into an OOD detector. Experimentally, we show how our method consistently outperforms the state-of-the-art in detection accuracy on popular image datasets while also reducing computational complexity.
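To make the over-confidence problem concrete, the following is a minimal sketch of the common max-softmax-probability baseline, which flags an input as OOD when the classifier's top softmax probability falls below a threshold. This is not the method proposed in the paper; the function names, the example logits, and the threshold of 0.9 are all hypothetical choices made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def max_softmax_score(logits):
    """Confidence score: the largest softmax probability."""
    return max(softmax(logits))


def is_ood(logits, threshold=0.9):
    """Flag an input as OOD when confidence is below the threshold.

    The threshold is a hypothetical value, not one taken from the paper.
    """
    return max_softmax_score(logits) < threshold


# Hypothetical logits with a small margin: low confidence, flagged as OOD.
low_margin = [2.0, 1.0, 0.1]
# Hypothetical logits with a large margin: near-certain, not flagged.
high_margin = [10.0, 0.0, 0.0]

print(is_ood(low_margin), is_ood(high_margin))
```

If a network emits large-margin logits for an input far from the training distribution, a confidence threshold of this kind will not flag it, which is the failure mode the abstract describes.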