Title
The Distributed Information Bottleneck reveals the explanatory structure of complex systems
Authors
Abstract
The fruits of science are relationships made comprehensible, often by way of approximation. While deep learning is an extremely powerful way to find relationships in data, its use in science has been hindered by the difficulty of understanding the learned relationships. The Information Bottleneck (IB) is an information theoretic framework for understanding a relationship between an input and an output in terms of a trade-off between the fidelity and complexity of approximations to the relationship. Here we show that a crucial modification -- distributing bottlenecks across multiple components of the input -- opens fundamentally new avenues for interpretable deep learning in science. The Distributed Information Bottleneck throttles the downstream complexity of interactions between the components of the input, deconstructing a relationship into meaningful approximations found through deep learning without requiring custom-made datasets or neural network architectures. Applied to a complex system, the approximations illuminate aspects of the system's nature by restricting -- and monitoring -- the information about different components incorporated into the approximation. We demonstrate the Distributed IB's explanatory utility in systems drawn from applied mathematics and condensed matter physics. In the former, we deconstruct a Boolean circuit into approximations that isolate the most informative subsets of input components without requiring exhaustive search. In the latter, we localize information about future plastic rearrangement in the static structure of a sheared glass, and find the information to be more or less diffuse depending on the system's preparation. By way of a principled scheme of approximations, the Distributed IB brings much-needed interpretability to deep learning and enables unprecedented analysis of information flow through a system.
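The objective the abstract describes, a prediction (distortion) term plus a separate information penalty on each input component, can be sketched in a few lines. The following is a minimal NumPy illustration, not code from the paper: the Gaussian encoders, the mean-squared distortion, and all names and shapes are assumptions made for the sketch. The per-component KL term is the standard variational upper bound on the mutual information I(X_i; Z_i) used in variational information bottleneck methods.

```python
import numpy as np

def gaussian_kl(mu, logvar):
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions.
    # Nonnegative; serves as an upper bound on the information passed through
    # one component's bottleneck.
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)

def distributed_ib_loss(components, y_pred, y_true, encoders, beta):
    """Distributed IB objective: distortion + beta * sum_i KL_i.

    components : list of arrays, one (batch, d_i) array per input component
    encoders   : list of (W_mu, W_logvar) linear maps, one pair per component
                 (illustrative stand-ins for learned per-component encoders)
    beta       : trade-off between fidelity and total bottlenecked information
    """
    kl_total = 0.0
    for x_i, (W_mu, W_lv) in zip(components, encoders):
        mu = x_i @ W_mu          # per-component Gaussian encoder: mean
        logvar = x_i @ W_lv      # per-component Gaussian encoder: log-variance
        kl_total += gaussian_kl(mu, logvar).mean()
    distortion = np.mean((y_pred - y_true) ** 2)  # assumed squared-error fidelity term
    return distortion + beta * kl_total
```

Monitoring each component's KL term during training is what localizes the information: components whose bottlenecks carry little information can be discarded from the approximation, which is how informative subsets of inputs are isolated without exhaustive search.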