Trajectory Test-Train Overlap in Next-Location Prediction Datasets

Paper title

Trajectory Test-Train Overlap in Next-Location Prediction Datasets

Authors

Massimiliano Luca, Luca Pappalardo, Bruno Lepri, Gianni Barlacchi

Abstract

Next-location prediction, consisting of forecasting a user's location given their historical trajectories, has important implications in several fields, such as urban planning, geo-marketing, and disease spreading. Several predictors have been proposed in the last few years to address it, including last-generation ones based on deep learning. This paper tests the generalization capability of these predictors on public mobility datasets, stratifying the datasets by whether the trajectories in the test set also appear fully or partially in the training set. We consistently discover a severe problem of trajectory overlapping in all analyzed datasets, highlighting that predictors memorize trajectories while having limited generalization capacities. We thus propose a methodology to rerank the outputs of the next-location predictors based on spatial mobility patterns. With these techniques, we significantly improve the predictors' generalization capability, with a relative improvement on the accuracy up to 96.15% on the trajectories that cannot be memorized (i.e., low overlap with the training set).
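The stratification step described in the abstract can be illustrated with a small sketch. The abstract does not say how overlap is measured, so the code below assumes a longest-common-subsequence score over location IDs, normalized by the test trajectory's length. The function names, the bin edges, and the toy trajectories are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def lcs_length(a: Sequence[int], b: Sequence[int]) -> int:
    """Length of the longest common subsequence of two location-ID sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def max_overlap(test_traj: Sequence[int],
                train_trajs: Sequence[Sequence[int]]) -> float:
    """Highest share of a test trajectory that also appears, in order,
    in any single training trajectory (assumed overlap measure)."""
    if not test_traj:
        return 0.0
    best = max((lcs_length(test_traj, t) for t in train_trajs), default=0)
    return best / len(test_traj)


def stratify(test_trajs: Sequence[Sequence[int]],
             train_trajs: Sequence[Sequence[int]],
             edges: Sequence[float] = (0.2, 0.4, 0.6, 0.8)) -> Dict[int, List[int]]:
    """Group test-trajectory indices into overlap bins.
    Bin k holds trajectories whose score exceeds exactly k of the edges."""
    bins: Dict[int, List[int]] = {k: [] for k in range(len(edges) + 1)}
    for idx, traj in enumerate(test_trajs):
        score = max_overlap(traj, train_trajs)
        bins[sum(score > e for e in edges)].append(idx)
    return bins


# Toy data: location IDs are arbitrary integers.
train = [[1, 2, 3, 4], [5, 6, 7, 8]]
test = [[1, 2, 3, 4], [1, 9, 3, 10], [11, 12, 13, 14]]
strata = stratify(test, train)
```

Under these assumptions, a predictor's accuracy would then be reported separately per bin, so that performance on low-overlap trajectories (which cannot be memorized) is visible instead of being averaged away.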
