On the Eigenvalues of Global Covariance Pooling for Fine-grained Visual Recognition

Paper title

On the Eigenvalues of Global Covariance Pooling for Fine-grained Visual Recognition

Authors

Yue Song, Nicu Sebe, Wei Wang

Abstract

Fine-Grained Visual Categorization (FGVC) is challenging because subtle inter-class variations are difficult to capture. One notable line of research uses the Global Covariance Pooling (GCP) layer to learn powerful representations with second-order statistics, which can effectively model inter-class differences. In our previous conference paper, we showed that truncating the small eigenvalues of the GCP covariance yields smoother gradients and improves performance on large-scale benchmarks. On fine-grained datasets, however, truncating the small eigenvalues makes the model fail to converge. This observation contradicts the common assumption that small eigenvalues merely correspond to noisy, unimportant information, and that ignoring them should therefore have little influence on performance. To diagnose this peculiar behavior, we propose two attribution methods whose visualizations demonstrate that the seemingly unimportant small eigenvalues are crucial, as they are responsible for extracting the discriminative class-specific features. Inspired by this observation, we propose a network branch dedicated to magnifying the importance of small eigenvalues. Without introducing any additional parameters, this branch simply amplifies the small eigenvalues and achieves state-of-the-art performance among GCP methods on three fine-grained benchmarks. Furthermore, the performance is also competitive with other FGVC approaches on larger datasets. Code is available at https://github.com/KingJamesSong/DifferentiableSVD.
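
The two operations contrasted in the abstract, truncating versus amplifying the small eigenvalues of a pooled covariance, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the choice of `k` smallest eigenvalues, and the multiplicative `gain` are all assumptions made for illustration, since the abstract does not say how the amplification branch rescales the spectrum.

```python
import numpy as np


def covariance_pool(features):
    """Pool an (N, C) feature map (N positions, C channels) into a (C, C) covariance."""
    centered = features - features.mean(axis=0, keepdims=True)
    return centered.T @ centered / features.shape[0]


def rescale_small_eigenvalues(cov, k, gain):
    """Multiply the k smallest eigenvalues of a symmetric matrix by `gain`.

    Hypothetical helper: gain = 0.0 truncates those eigenvalues,
    gain > 1.0 amplifies them. The eigenvectors are left unchanged.
    """
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    scaled = eigvals.copy()
    scaled[:k] *= gain
    # Rebuild V diag(scaled) V^T; broadcasting scales each eigenvector column.
    return (eigvecs * scaled) @ eigvecs.T


# Toy input: 196 spatial positions with 8 channels (arbitrary illustrative sizes).
rng = np.random.default_rng(0)
features = rng.standard_normal((196, 8))

cov = covariance_pool(features)
truncated = rescale_small_eigenvalues(cov, k=4, gain=0.0)
amplified = rescale_small_eigenvalues(cov, k=4, gain=2.0)
```

Because only the eigenvalues are rescaled, a step like this adds no learnable parameters, which matches the abstract's description of the branch as parameter-free.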
