Paper title
Supervised sequential pattern mining of event sequences in sport to identify important patterns of play: an application to rugby union
Paper authors
Paper abstract
Given a set of sequences composed of time-ordered events, sequential pattern mining is useful for identifying frequent subsequences across different sequences or within the same sequence. However, in sport, these techniques cannot determine the importance of particular patterns of play to good or bad outcomes, which is often of greater interest to coaches and performance analysts. In this study, we apply a recently proposed supervised sequential pattern mining algorithm called safe pattern pruning (SPP) to 490 labelled event sequences representing passages of play from one rugby team's matches in the 2018 Japan Top League. We compare the SPP-obtained patterns that are most discriminative between scoring and non-scoring outcomes, from both the team's and the opposition teams' perspectives, with the most frequent patterns obtained by well-known unsupervised sequential pattern mining algorithms applied to subsets of the original dataset, split on the label. Linebreaks, successful lineouts, regained kicks in play, repeated phase-breakdown play, and failed exit plays by the opposition team were identified as the patterns that discriminated most between the team scoring and not scoring. Opposition team linebreaks, errors made by the team, opposition team lineouts, and repeated phase-breakdown play by the opposition team were identified as the patterns that discriminated most between the opposition team scoring and not scoring. We also found that, by virtue of its supervised nature and its pruning and safe-screening properties, SPP obtained a greater variety of generally more sophisticated patterns than the unsupervised models, and these patterns are likely to be of more utility to coaches and performance analysts.
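The distinction the abstract draws between frequent and discriminative patterns can be illustrated with a small sketch. This is not the SPP algorithm and not the authors' code: it is a brute-force toy that enumerates short subsequences, computes their support in the scoring and non-scoring subsets, and ranks them by the difference. The function names, the event labels, and the four toy sequences are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def subsequences(seq, max_len):
    """Distinct ordered (possibly non-contiguous) subsequences of seq, up to max_len events."""
    found = set()
    for n in range(1, max_len + 1):
        for idx in combinations(range(len(seq)), n):
            found.add(tuple(seq[i] for i in idx))
    return found


def support(sequences, max_len=2):
    """Fraction of sequences that contain each subsequence at least once."""
    total = len(sequences)
    if total == 0:
        return {}
    counts = Counter()
    for seq in sequences:
        counts.update(subsequences(seq, max_len))
    return {pattern: count / total for pattern, count in counts.items()}


def support_gap(labelled, max_len=2):
    """Support in scoring sequences minus support in non-scoring sequences."""
    scoring = [seq for seq, label in labelled if label == 1]
    non_scoring = [seq for seq, label in labelled if label == 0]
    sup_pos = support(scoring, max_len)
    sup_neg = support(non_scoring, max_len)
    patterns = set(sup_pos) | set(sup_neg)
    return {p: sup_pos.get(p, 0.0) - sup_neg.get(p, 0.0) for p in patterns}


# Hypothetical toy data: (event sequence, label), where 1 = scored and 0 = did not score.
toy = [
    (["lineout", "phase", "linebreak"], 1),
    (["scrum", "phase", "linebreak"], 1),
    (["lineout", "phase", "error"], 0),
    (["kick", "phase", "error"], 0),
]

gaps = support_gap(toy)
all_support = support([seq for seq, _ in toy])
```

In this toy data the event `phase` appears in every sequence, so a frequency-based miner would rank it first, yet its support gap is zero because it occurs equally often in both classes. The pattern `phase` followed by `linebreak` is less frequent overall but occurs only in scoring sequences. Exhaustive enumeration like this grows combinatorially with sequence length, which is the cost that pruning-based methods are designed to avoid.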