通过有效地发现预测的最近邻居来改善局部敏感的哈西

论文标题

通过有效地发现预测的最近邻居来改善局部敏感的哈西

Improving Locality Sensitive Hashing by Efficiently Finding Projected Nearest Neighbors

论文作者

Jafari, Omid, Nagarkar, Parth, Montaño, Jonathan

论文摘要

对于许多多媒体应用程序，高维空间中的相似性搜索是一项重要任务。由于臭名昭著的维度诅咒，大约最近的邻居技术比精确的搜索技术更优选，因为它们可以以更好的速度返回足够好的结果。局部敏感的哈希（LSH）是一种非常流行的随机哈希技术，用于查找大约最近的邻居。现有的最新位置敏感散列技术着重于改善整体过程的性能，主要集中于最大程度地减少iOS总数，同时牺牲整体处理时间。 LSH技术中的主要时间耗尽过程是在投影空间中找到相邻点的过程。我们提出了一种新型的索引结构，称为半径优化的位置敏感的哈希（Rolsh）。借助采样技术和神经网络，我们提出了两种技术，可以有效地找到投影空间中的相邻点，而无需牺牲结果的准确性。我们对实际数据集的广泛实验分析表明，罗尔什（Rolsh）比现有最新的LSH技术的性能优势。

Similarity search in high-dimensional spaces is an important task for many multimedia applications. Due to the notorious curse of dimensionality, approximate nearest neighbor techniques are preferred over exact searching techniques since they can return good enough results at a much better speed. Locality Sensitive Hashing (LSH) is a very popular random hashing technique for finding approximate nearest neighbors. Existing state-of-the-art Locality Sensitive Hashing techniques that focus on improving performance of the overall process, mainly focus on minimizing the total number of IOs while sacrificing the overall processing time. The main time-consuming process in LSH techniques is the process of finding neighboring points in projected spaces. We present a novel index structure called radius-optimized Locality Sensitive Hashing (roLSH). With the help of sampling techniques and Neural Networks, we present two techniques to find neighboring points in projected spaces efficiently, without sacrificing the accuracy of the results. Our extensive experimental analysis on real datasets shows the performance benefit of roLSH over existing state-of-the-art LSH techniques.

下载PDF全文

下载文献需遵守相关版权规定

论文标题