Paper title
Leveraging universality of jet taggers through transfer learning
Paper authors
Paper abstract
A significant challenge in the tagging of boosted objects via machine-learning technology is the prohibitive computational cost associated with training sophisticated models. Nevertheless, the universality of QCD suggests that a large amount of the information learnt in the training is common to different physical signals and experimental setups. In this article, we explore the use of transfer learning techniques to develop fast and data-efficient jet taggers that leverage such universality. We consider the graph neural networks LundNet and ParticleNet, and introduce two prescriptions to transfer an existing tagger into a new signal based either on fine-tuning all the weights of a model or alternatively on freezing a fraction of them. In the case of $W$-boson and top-quark tagging, we find that one can obtain reliable taggers using an order of magnitude less data with a corresponding speed-up of the training process. Moreover, while keeping the size of the training data set fixed, we observe a speed-up of the training by up to a factor of three. This offers a promising avenue to facilitate the use of such tools in collider physics experiments.
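As a rough illustration of the two prescriptions described in the abstract, a minimal PyTorch-style sketch is given below. The ToyTagger class, layer sizes, learning rates, and the choice of which blocks to freeze are illustrative assumptions rather than the paper's actual LundNet or ParticleNet implementation; the sketch only indicates where fine-tuning all the weights of a transferred model, versus freezing a fraction of them, enters the training setup.

```python
import torch
import torch.nn as nn

# Illustrative stand-in for a graph-based jet tagger such as LundNet or
# ParticleNet: a stack of feature-extraction blocks followed by a
# classification head. The real architectures are graph neural networks;
# this toy model only shows where the two transfer-learning prescriptions act.
class ToyTagger(nn.Module):
    def __init__(self, n_features=4, hidden=64, n_blocks=3, n_classes=2):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Linear(n_features if i == 0 else hidden, hidden)
            for i in range(n_blocks)
        )
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):
        for block in self.blocks:
            x = torch.relu(block(x))
        return self.head(x)

# A tagger assumed to be already trained on one signal (e.g. W-boson tagging),
# reused as the starting point for a new signal (e.g. top-quark tagging).
pretrained = ToyTagger()
model = ToyTagger()
model.load_state_dict(pretrained.state_dict())

# Prescription 1: fine-tune all the weights of the transferred model,
# typically with a reduced learning rate and much less training data.
optimizer_finetune = torch.optim.Adam(model.parameters(), lr=1e-4)

# Prescription 2: freeze a fraction of the network (here the earliest blocks,
# which encode signal-independent QCD information) and train only the rest.
for block in model.blocks[:2]:
    for p in block.parameters():
        p.requires_grad = False
optimizer_frozen = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)
```

In the frozen variant, only the parameters left with requires_grad enabled are passed to the optimizer, so fewer weights are updated per step; this is the mechanism behind the reduced training cost and data requirements discussed in the abstract.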