Paper Title

Toward Learning Robust and Invariant Representations with Alignment Regularization and Data Augmentation

Paper Authors

Haohan Wang, Zeyi Huang, Xindi Wu, Eric P. Xing

Paper Abstract

Data augmentation has been proven to be an effective technique for developing machine learning models that are robust to known classes of distributional shifts (e.g., rotations of images), and alignment regularization is a technique often used together with data augmentation to further help the model learn representations invariant to the shifts used to augment the data. In this paper, motivated by a proliferation of options of alignment regularizations, we seek to evaluate the performances of several popular design choices along the dimensions of robustness and invariance, for which we introduce a new test procedure. Our synthetic experiment results speak to the benefits of squared l2 norm regularization. Further, we also formally analyze the behavior of alignment regularization to complement our empirical study under assumptions we consider realistic. Finally, we test this simple technique we identify (worst-case data augmentation with squared l2 norm alignment regularization) and show that the benefits of this method outrun those of the specially designed methods. We also release a software package in both TensorFlow and PyTorch for users to use the method with a couple of lines at https://github.com/jyanln/AlignReg.
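For illustration, the sketch below shows one way the core idea described in the abstract (worst-case data augmentation combined with a squared l2-norm alignment penalty) could be implemented in PyTorch. The function names (`worst_case_augment`, `alignment_loss`), the use of logits as the aligned representation, the `augmentations` callable interface, and the `reg_weight` parameter are assumptions made for this sketch; they are not taken from the paper or from the AlignReg package API.

```python
import torch
import torch.nn.functional as F


def worst_case_augment(model, x, y, augmentations):
    """For each example, pick the augmented copy with the highest loss (sketch).

    `augmentations` is assumed to be a list of callables mapping a batch of
    inputs to an augmented batch of the same shape.
    """
    # One augmented copy of the batch per augmentation: shape (A, B, ...)
    candidates = torch.stack([aug(x) for aug in augmentations])
    with torch.no_grad():
        losses = torch.stack([
            F.cross_entropy(model(c), y, reduction="none") for c in candidates
        ])  # (A, B): per-example loss under each augmentation
    worst = losses.argmax(dim=0)  # index of the worst-case augmentation per example
    return candidates[worst, torch.arange(x.size(0))]


def alignment_loss(model, x, x_aug, y, reg_weight=1.0):
    """Cross-entropy on the worst-case augmented batch plus a squared
    l2-norm alignment penalty between the outputs on original and
    augmented inputs (logits used here as the aligned representation)."""
    logits, logits_aug = model(x), model(x_aug)
    ce = F.cross_entropy(logits_aug, y)
    align = ((logits - logits_aug) ** 2).sum(dim=1).mean()  # squared l2 norm
    return ce + reg_weight * align
```

In a training loop, `x_aug = worst_case_augment(model, x, y, augmentations)` would be recomputed at each step before calling `alignment_loss`; the released AlignReg package may differ in interface and implementation details.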
