Paper Title

Efficient Architecture Search for Diverse Tasks

Paper Authors

Junhong Shen, Mikhail Khodak, Ameet Talwalkar

Paper Abstract

While neural architecture search (NAS) has enabled automated machine learning (AutoML) for well-researched areas, its application to tasks beyond computer vision is still under-explored. As less-studied domains are precisely those where we expect AutoML to have the greatest impact, in this work we study NAS for efficiently solving diverse problems. Seeking an approach that is fast, simple, and broadly applicable, we fix a standard convolutional network (CNN) topology and propose to search for the right kernel sizes and dilations its operations should take on. This dramatically expands the model's capacity to extract features at multiple resolutions for different types of data while only requiring search over the operation space. To overcome the efficiency challenges of naive weight-sharing in this search space, we introduce DASH, a differentiable NAS algorithm that computes the mixture-of-operations using the Fourier diagonalization of convolution, achieving both a better asymptotic complexity and an up-to-10x search time speedup in practice. We evaluate DASH on ten tasks spanning a variety of application domains such as PDE solving, protein folding, and heart disease detection. DASH outperforms state-of-the-art AutoML methods in aggregate, attaining the best-known automated performance on seven tasks. Meanwhile, on six of the ten tasks, the combined search and retraining time is less than 2x slower than simply training a CNN backbone that is far less accurate.
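The efficiency idea the abstract describes rests on two facts: convolution is linear in its kernel, and it is diagonalized by the Fourier transform. So a weighted mixture of candidate convolutions (different kernel sizes and dilations) can be collapsed into a single convolution with a mixed kernel, and that convolution can itself be evaluated in the Fourier domain. The following is a minimal PyTorch sketch of this idea, not the authors' implementation: function names are illustrative, all kernel sizes are assumed odd, and the FFT helper assumes single-channel signals for brevity.

```python
import torch
import torch.nn.functional as F

def mixed_conv_naive(x, kernels, alphas):
    """Naive weight sharing: run every candidate convolution and mix
    the outputs; cost scales with the number of candidates.
    x: (batch, channels, length); each kernel: (c_out, c_in, k), k odd."""
    return sum(a * F.conv1d(x, k, padding=k.shape[-1] // 2)
               for a, k in zip(alphas, kernels))

def mixed_conv_combined(x, kernels, alphas):
    """Equivalent mixture via ONE convolution: convolution is linear in
    its kernel, so the alpha-weighted sum of zero-padded candidate
    kernels produces the same output in a single pass."""
    k_max = max(k.shape[-1] for k in kernels)
    combined = sum(a * F.pad(k, ((k_max - k.shape[-1]) // 2,) * 2)
                   for a, k in zip(alphas, kernels))
    return F.conv1d(x, combined, padding=k_max // 2)

def fft_conv_same(x, kernel):
    """Single-channel 'same'-size cross-correlation computed in the
    Fourier domain, where convolution is diagonal (elementwise product).
    x: (batch, n); kernel: (k,), k odd."""
    n, k = x.shape[-1], kernel.shape[-1]
    m = n + k - 1                               # linear-convolution length
    X = torch.fft.rfft(x, m)
    K = torch.fft.rfft(torch.flip(kernel, (-1,)), m)  # flip: conv1d is cross-correlation
    y = torch.fft.irfft(X * K, m)
    return y[..., k // 2 : k // 2 + n]          # crop to 'same' output size
```

With `alphas` produced by a softmax over architecture parameters, `mixed_conv_naive` and `mixed_conv_combined` return identical outputs, but the combined form performs one convolution no matter how many kernel sizes and dilations are in the search space; dilated kernels fit the same scheme, since a dilated kernel is just a kernel with zeros inserted between its taps.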
