Paper Title
Hyperparameter Transfer Across Developer Adjustments
Paper Authors
Paper Abstract
After developer adjustments to a machine learning (ML) algorithm, how can the results of an old hyperparameter optimization (HPO) automatically be used to speed up a new HPO? This question poses a challenging problem, as developer adjustments can change which hyperparameter settings perform well, or even the hyperparameter search space itself. While many approaches exist that leverage knowledge obtained on previous tasks, so far, knowledge from previous development steps remains entirely untapped. In this work, we remedy this situation and propose a new research framework: hyperparameter transfer across adjustments (HT-AA). To lay a solid foundation for this research framework, we provide four simple HT-AA baseline algorithms and eight benchmarks that change various aspects of ML algorithms, their hyperparameter search spaces, and the neural architectures used. On average, and depending on the budgets for the old and new HPO, the best baseline reaches a given performance 1.2--2.6x faster than a prominent HPO algorithm without transfer. As HPO is a crucial step in ML development but requires extensive computational resources, this speedup would lead to faster development cycles, lower costs, and reduced environmental impacts. To make these benefits available to ML developers off-the-shelf and to facilitate future research on HT-AA, we provide Python packages for our baselines and benchmarks.
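For intuition, the following is a minimal sketch of what one simple transfer strategy of this kind could look like, assuming discrete search spaces represented as dictionaries mapping hyperparameter names to lists of choices. The function names, the top-k transfer rule, and the projection heuristic are illustrative assumptions, not the paper's actual four baselines.

```python
import random

def project(old_config, new_space):
    """Project an old configuration onto a possibly changed search space:
    keep values for hyperparameters that still exist and remain valid,
    and sample fresh values for new or changed hyperparameters."""
    projected = {}
    for name, choices in new_space.items():
        if name in old_config and old_config[name] in choices:
            projected[name] = old_config[name]        # survived the adjustment
        else:
            projected[name] = random.choice(choices)  # new or changed hyperparameter
    return projected

def warm_started_hpo(old_results, new_space, evaluate, budget, k=1):
    """old_results: list of (config, loss) pairs from the previous HPO.
    Evaluates the k best old configurations first, then spends the
    remaining budget on plain random search over the new space."""
    history = []
    # 1) Transfer phase: try the best old configurations first.
    for old_config, _ in sorted(old_results, key=lambda r: r[1])[:k]:
        candidate = project(old_config, new_space)
        history.append((candidate, evaluate(candidate)))
    # 2) Fallback phase: random search with the remaining budget.
    while len(history) < budget:
        candidate = {name: random.choice(choices)
                     for name, choices in new_space.items()}
        history.append((candidate, evaluate(candidate)))
    return min(history, key=lambda h: h[1])
```

A strategy like this degrades gracefully: if the developer adjustment invalidates the old optimum, only the first k evaluations are spent on transferred configurations before the search falls back to standard random search.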