Paper Title

Deep generative models for musical audio synthesis

Paper Authors

Huzaifah, M., Wyse, L.

Paper Abstract

Sound modelling is the process of developing algorithms that generate sound under parametric control. There are a few distinct approaches that have been developed historically including modelling the physics of sound production and propagation, assembling signal generating and processing elements to capture acoustic features, and manipulating collections of recorded audio samples. While each of these approaches has been able to achieve high-quality synthesis and interaction for specific applications, they are all labour-intensive and each comes with its own challenges for designing arbitrary control strategies. Recent generative deep learning systems for audio synthesis are able to learn models that can traverse arbitrary spaces of sound defined by the data they train on. Furthermore, machine learning systems are providing new techniques for designing control and navigation strategies for these models. This paper is a review of developments in deep learning that are changing the practice of sound modelling.
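The review itself contains no code, but as a rough illustration of the kind of autoregressive, sample-level generative model it surveys (WaveNet-style systems), a minimal sketch in PyTorch might look as follows. All class names, layer sizes, and the training step below are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Conv1d):
    # 1-D convolution that only sees past samples, via left padding.
    def __init__(self, in_ch, out_ch, kernel_size, dilation=1):
        super().__init__(in_ch, out_ch, kernel_size, dilation=dilation)
        self.left_pad = (kernel_size - 1) * dilation

    def forward(self, x):
        x = F.pad(x, (self.left_pad, 0))  # pad the time axis on the left only
        return super().forward(x)

class TinySampleModel(nn.Module):
    # Predicts a categorical distribution over the next quantised audio sample.
    def __init__(self, quant_levels=256, channels=64, layers=6):
        super().__init__()
        self.embed = nn.Embedding(quant_levels, channels)
        self.stack = nn.ModuleList(
            CausalConv1d(channels, channels, kernel_size=2, dilation=2 ** i)
            for i in range(layers)
        )
        self.out = nn.Conv1d(channels, quant_levels, kernel_size=1)

    def forward(self, samples):                    # samples: (batch, time) int64
        h = self.embed(samples).transpose(1, 2)    # -> (batch, channels, time)
        for conv in self.stack:
            h = torch.relu(conv(h)) + h            # residual connection
        return self.out(h)                         # logits for the next sample

# Training step: maximise the likelihood of each sample given its past context.
model = TinySampleModel()
x = torch.randint(0, 256, (4, 1024))               # stand-in for mu-law-quantised audio
logits = model(x[:, :-1])                          # predict x[t] from x[<t]
loss = F.cross_entropy(logits, x[:, 1:])
loss.backward()

Sampling from such a model proceeds one audio sample at a time, which is exactly the control and efficiency trade-off the review discusses for this family of models.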
