Paper Title

BatchFormer: Learning to Explore Sample Relationships for Robust Representation Learning

Paper Authors

Zhi Hou, Baosheng Yu, Dacheng Tao

Paper Abstract

Despite the success of deep neural networks, there are still many challenges in deep representation learning due to data scarcity issues such as data imbalance, unseen distributions, and domain shift. To address these issues, a variety of methods have been devised to explore sample relationships in a vanilla way (i.e., from the perspective of either the input or the loss function), failing to exploit the internal structure of deep neural networks for learning with sample relationships. Inspired by this, we propose to equip deep neural networks themselves with the ability to learn sample relationships from each mini-batch. Specifically, we introduce a batch transformer module, BatchFormer, which is applied to the batch dimension of each mini-batch to implicitly explore sample relationships during training. By doing this, the proposed method enables the collaboration of different samples; e.g., head-class samples can also contribute to the learning of tail classes for long-tailed recognition. Furthermore, to mitigate the gap between training and testing, we share the classifier between the streams with and without BatchFormer during training, so that BatchFormer can be removed during testing. We perform extensive experiments on over ten datasets, and the proposed method achieves significant improvements on different data-scarcity applications without any bells and whistles, including long-tailed recognition, compositional zero-shot learning, domain generalization, and contrastive learning. Code will be made publicly available at https://github.com/zhihou7/BatchFormer.
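
To make the idea concrete, below is a minimal PyTorch sketch of the two ingredients described in the abstract: a transformer encoder applied along the batch dimension, and the shared-classifier training strategy that lets the module be removed at test time. The layer sizes, the toy backbone, and the helper names (`BatchFormer`, `shared_classifier_loss`) are illustrative assumptions, not the authors' exact configuration; see https://github.com/zhihou7/BatchFormer for the official implementation.

```python
import torch
import torch.nn as nn


class BatchFormer(nn.Module):
    """Transformer encoder applied along the batch dimension, so samples
    within a mini-batch can attend to one another. The hyper-parameters
    (dim, num_heads, num_layers) are illustrative, not the paper's."""

    def __init__(self, dim=512, num_heads=4, num_layers=1):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=num_heads, dim_feedforward=dim)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, x):
        # x: (batch, dim). Inserting a dummy axis makes the batch axis
        # play the role of the sequence axis, so self-attention mixes
        # information across samples rather than across tokens.
        return self.encoder(x.unsqueeze(1)).squeeze(1)


def shared_classifier_loss(features, labels, batchformer, classifier, criterion):
    # Shared-classifier strategy from the abstract: one classifier scores
    # features both with and without BatchFormer, so the module can be
    # dropped at test time without a train/test gap.
    loss_plain = criterion(classifier(features), labels)
    loss_bf = criterion(classifier(batchformer(features)), labels)
    return loss_plain + loss_bf


# Minimal usage example with a hypothetical toy backbone.
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 512))
classifier = nn.Linear(512, 10)
bf = BatchFormer(dim=512)

images = torch.randn(16, 3, 32, 32)
labels = torch.randint(0, 10, (16,))
loss = shared_classifier_loss(backbone(images), labels, bf, classifier,
                              nn.CrossEntropyLoss())
loss.backward()
```

At inference, one would simply call `classifier(backbone(images))` and skip `bf`, which is exactly why the classifier is shared during training.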
