Paper Title

Relative Flatness and Generalization

Authors

Petzka, Henning, Kamp, Michael, Adilova, Linara, Sminchisescu, Cristian, Boley, Mario

Abstract

Flatness of the loss curve is conjectured to be connected to the generalization ability of machine learning models, in particular neural networks. While it has been empirically observed that flatness measures consistently correlate strongly with generalization, it is still an open theoretical problem why and under which circumstances flatness is connected to generalization, in particular in light of reparameterizations that change certain flatness measures but leave generalization unchanged. We investigate the connection between flatness and generalization by relating it to the interpolation from representative data, deriving notions of representativeness, and feature robustness. The notions allow us to rigorously connect flatness and generalization and to identify conditions under which the connection holds. Moreover, they give rise to a novel, but natural relative flatness measure that correlates strongly with generalization, simplifies to ridge regression for ordinary least squares, and solves the reparameterization issue.
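The abstract notes that some flatness measures change under reparameterizations that leave the model's predictions (and hence generalization) unchanged, and that a relative flatness measure avoids this. As an illustrative sketch only (not the paper's exact definition), one can take a weight-norm-rescaled Hessian trace for ordinary least squares and check numerically that it is invariant under the rescaling reparameterization `(X, w) -> (X/a, a*w)`, which leaves predictions `Xw` unchanged:

```python
import numpy as np

def relative_flatness(X, w):
    """Sketch of a relative flatness quantity for OLS: ||w||^2 * tr(H),
    where H is the Hessian of mean((X @ w - y)^2) with respect to w.
    (The Hessian of this quadratic loss is 2 * X.T @ X / n, independent of y.)
    """
    n = X.shape[0]
    H = 2.0 * X.T @ X / n
    return float(w @ w) * float(np.trace(H))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
w = rng.normal(size=5)

a = 3.7  # rescaling factor: predictions (X / a) @ (a * w) == X @ w
base = relative_flatness(X, w)
rescaled = relative_flatness(X / a, a * w)
print(np.isclose(base, rescaled))  # the rescaled measure matches the original
```

A plain Hessian trace would shrink by `1/a**2` under this rescaling while the model's outputs stayed the same; multiplying by the squared weight norm cancels that factor, which is the kind of invariance the abstract attributes to relative flatness.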
