Deep Unsupervised Image Hashing by Maximizing Bit Entropy

Paper Title

Deep Unsupervised Image Hashing by Maximizing Bit Entropy

Authors

Yunqiang Li, Jan van Gemert

Abstract

Unsupervised hashing is important for indexing huge image or video collections without expensive annotations. Hashing aims to learn short binary codes for compact storage and efficient semantic retrieval. We propose an unsupervised deep hashing layer called Bi-half Net that maximizes the entropy of the binary codes. Entropy is maximal when both possible values of the bit are uniformly (half-half) distributed. To maximize bit entropy, we do not add a term to the loss function, as this is difficult to optimize and tune. Instead, we design a new parameter-free network layer to explicitly force continuous image features to approximate the optimal half-half bit distribution. This layer is shown to minimize a penalized term of the Wasserstein distance between the learned continuous image features and the optimal half-half bit distribution. Experimental results on the image datasets Flickr25k, NUS-WIDE, CIFAR-10, MS COCO, and MNIST and the video datasets UCF-101 and HMDB-51 show that our approach leads to compact codes and compares favorably to the current state of the art.
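
The half-half idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes that "forcing a half-half distribution" is done by ranking each bit's continuous features within a batch and assigning +1 to the upper half and -1 to the lower half. The function names `bit_entropy` and `half_half_binarize` are hypothetical and chosen only for this example, and the gradient handling a trainable layer would need is omitted.

```python
import numpy as np

def bit_entropy(codes):
    """Per-bit Shannon entropy (in bits) of a {-1, +1} code matrix of shape (n, k)."""
    p = (codes > 0).mean(axis=0)        # fraction of +1 values for each bit
    p = np.clip(p, 1e-12, 1 - 1e-12)    # avoid log(0) for constant bits
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def half_half_binarize(features):
    """Assumed batch-wise rule: top half of each column -> +1, bottom half -> -1."""
    n = features.shape[0]
    order = np.argsort(features, axis=0, kind="stable")   # ascending order per column
    ranks = np.argsort(order, axis=0, kind="stable")      # rank of each entry in its column
    return np.where(ranks >= n // 2, 1, -1)

rng = np.random.default_rng(0)
# Features shifted away from zero, so plain sign() tends to give unbalanced bits.
feats = rng.normal(loc=1.0, size=(8, 4))

sign_codes = np.where(feats > 0, 1, -1)   # baseline: threshold at zero
half_codes = half_half_binarize(feats)    # balanced by construction

print("sign entropy per bit:     ", bit_entropy(sign_codes))
print("half-half entropy per bit:", bit_entropy(half_codes))
```

With an even batch size, every column of `half_codes` contains equally many +1 and -1 entries, so each bit reaches the maximum entropy of 1 bit, whereas thresholding at zero gives no such guarantee.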
