Paper Title

SMILEtrack: SiMIlarity LEarning for Occlusion-Aware Multiple Object Tracking

Authors

Yu-Hsiang Wang, Jun-Wei Hsieh, Ping-Yang Chen, Ming-Ching Chang, Hung Hin So, Xin Li

Abstract

Despite recent progress in Multiple Object Tracking (MOT), several obstacles such as occlusions, similar objects, and complex scenes remain an open challenge. Meanwhile, a systematic study of the cost-performance tradeoff for the popular tracking-by-detection paradigm is still lacking. This paper introduces SMILEtrack, an innovative object tracker that effectively addresses these challenges by integrating an efficient object detector with a Siamese network-based Similarity Learning Module (SLM). The technical contributions of SMILEtrack are twofold. First, we propose an SLM that calculates the appearance similarity between two objects, overcoming the limitations of feature descriptors in Separate Detection and Embedding (SDE) models. The SLM incorporates a Patch Self-Attention (PSA) block inspired by the vision Transformer, which generates reliable features for accurate similarity matching. Second, we develop a Similarity Matching Cascade (SMC) module with a novel GATE function for robust object matching across consecutive video frames, further enhancing MOT performance. Together, these innovations help SMILEtrack achieve an improved trade-off between the cost (e.g., running speed) and performance (e.g., tracking accuracy) over several existing state-of-the-art benchmarks, including the popular BYTETrack method. SMILEtrack outperforms BYTETrack by 0.4-0.8 MOTA and 2.1-2.2 HOTA points on MOT17 and MOT20 datasets. Code is available at https://github.com/pingyang1117/SMILEtrack_Official
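
To make the idea of appearance-based association in tracking-by-detection concrete, here is a minimal sketch. It is not the SLM, PSA block, SMC module, or GATE function from the paper, whose details the abstract does not give. It assumes each track and each detection already has an appearance feature vector, scores pairs with plain cosine similarity, and rejects pairs below a threshold. The function names and the `gate` parameter are hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def match_by_appearance(track_feats, det_feats, gate=0.5):
    """Greedily pair tracks with detections in order of descending similarity.

    `gate` is a hypothetical similarity threshold (an assumption, not the
    paper's GATE function): pairs scoring below it are never matched.
    Returns (matches, unmatched_tracks, unmatched_dets), where matches is a
    list of (track_index, detection_index) tuples.
    """
    candidates = []
    for t, track_feat in enumerate(track_feats):
        for d, det_feat in enumerate(det_feats):
            score = cosine_similarity(track_feat, det_feat)
            if score >= gate:
                candidates.append((score, t, d))

    # Highest-similarity pairs are committed first.
    candidates.sort(key=lambda c: c[0], reverse=True)

    matches, used_tracks, used_dets = [], set(), set()
    for score, t, d in candidates:
        if t in used_tracks or d in used_dets:
            continue
        matches.append((t, d))
        used_tracks.add(t)
        used_dets.add(d)

    unmatched_tracks = [t for t in range(len(track_feats)) if t not in used_tracks]
    unmatched_dets = [d for d in range(len(det_feats)) if d not in used_dets]
    return matches, unmatched_tracks, unmatched_dets
```

In this sketch, a detection left unmatched would typically start a new track, and a track left unmatched would be kept for a few frames in case the object is only occluded. Greedy matching is used here for brevity; an optimal assignment solver such as the Hungarian algorithm is a common alternative.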
