Paper Title
Learning Temporal Point Processes for Efficient Retrieval of Continuous Time Event Sequences
Paper Authors
Paper Abstract
Recent developments in predictive modeling using marked temporal point processes (MTPP) have enabled accurate characterization of several real-world applications involving continuous-time event sequences (CTESs). However, the problem of retrieving such sequences remains largely unaddressed in the literature. To tackle this, we propose NEUROSEQRET, which learns to retrieve and rank a relevant set of continuous-time event sequences for a given query sequence from a large corpus of sequences. More specifically, NEUROSEQRET first applies a trainable unwarping function to the query sequence, which makes it comparable with corpus sequences, especially when a relevant query-corpus pair has individually different attributes. Next, it feeds the unwarped query sequence and the corpus sequence into MTPP-guided neural relevance models. We develop two variants of the relevance model, which offer a tradeoff between accuracy and efficiency. We also propose an optimization framework to learn binary sequence embeddings from the relevance scores, suitable for locality-sensitive hashing, leading to a significant speedup in returning top-K results for a given query sequence. Our experiments with several datasets show a significant accuracy boost of NEUROSEQRET over several baselines, as well as the efficacy of our hashing mechanism.
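To make the hashing step in the abstract concrete, the following is a minimal sketch of top-K retrieval over binary codes. It is not the paper's method: the abstract says the binary embeddings are learned from relevance scores, whereas this sketch substitutes random-hyperplane sign codes as a stand-in. All function names, the fixed-length vector representation of each sequence, and the `max_radius` parameter are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def make_hyperplanes(dim, n_bits, seed=0):
    # Assumption: random Gaussian hyperplanes stand in for learned embeddings.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def binary_code(vec, hyperplanes):
    # One bit per hyperplane: 1 if the vector lies on its non-negative side.
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vec)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming(a, b):
    # Number of bit positions in which two codes differ.
    return sum(x != y for x, y in zip(a, b))


def build_index(corpus, hyperplanes):
    # Group corpus sequence ids by their binary code (one bucket per code).
    buckets = defaultdict(list)
    for seq_id, vec in corpus.items():
        buckets[binary_code(vec, hyperplanes)].append(seq_id)
    return buckets


def retrieve_top_k(query_vec, buckets, hyperplanes, k, max_radius=2):
    # Collect ids from buckets within max_radius bits of the query code,
    # then return the k ids with the smallest Hamming distance.
    q = binary_code(query_vec, hyperplanes)
    candidates = []
    for code, ids in buckets.items():
        d = hamming(q, code)
        if d <= max_radius:
            candidates.extend((d, seq_id) for seq_id in ids)
    candidates.sort()
    return [seq_id for _, seq_id in candidates[:k]]


if __name__ == "__main__":
    planes = make_hyperplanes(dim=2, n_bits=8)
    # Hypothetical fixed-length vectors representing corpus sequences.
    corpus = {"a": [1.0, 0.0], "b": [-1.0, 0.0], "c": [0.9, 0.1]}
    index = build_index(corpus, planes)
    print(retrieve_top_k([1.0, 0.0], index, planes, k=2, max_radius=8))
```

In a fuller pipeline, the short candidate list returned here would presumably be re-ranked by the relevance model, so that the expensive scoring runs only on a small subset of the corpus.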