Paper Title
HyperSound: Generating Implicit Neural Representations of Audio Signals with Hypernetworks
Paper Authors
Paper Abstract
Implicit neural representations (INRs) form a rapidly growing research field that provides alternative ways to represent multimedia signals. Recent applications of INRs include image super-resolution, compression of high-dimensional signals, and 3D rendering. However, these solutions usually focus on visual data, and adapting them to the audio domain is not trivial. Moreover, obtaining an INR typically requires training a separate model for every data sample. To address this limitation, we propose HyperSound, a meta-learning method leveraging hypernetworks to produce INRs for audio signals unseen at training time. We show that our approach can reconstruct sound waves with quality comparable to other state-of-the-art models.
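To make the general idea concrete, below is a minimal PyTorch sketch of the hypernetwork-for-INR setup described in the abstract: a hypernetwork consumes a raw waveform and emits the flat parameter vector of a small coordinate MLP that maps time to amplitude. All sizes, the encoder layout, and the sine activation are illustrative assumptions (`WAVE_LEN`, `HIDDEN`, `HyperNetwork`, `eval_inr` are hypothetical names), not the architecture from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sizes, chosen for illustration only.
WAVE_LEN = 16384      # length of the input waveform
HIDDEN = 64           # hidden width of the target INR
INR_LAYERS = [(1, HIDDEN), (HIDDEN, HIDDEN), (HIDDEN, 1)]  # t -> amplitude

def inr_param_count(layers):
    # Total weights plus biases across the INR's linear layers.
    return sum(i * o + o for i, o in layers)

class HyperNetwork(nn.Module):
    """Maps a raw waveform to a flat vector of INR parameters."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(WAVE_LEN, 512), nn.ReLU(),
            nn.Linear(512, inr_param_count(INR_LAYERS)),
        )

    def forward(self, wave):
        return self.encoder(wave)

def eval_inr(params, t):
    """Evaluate the generated INR at time coordinates t (shape [N, 1])."""
    offset, x = 0, t
    for idx, (i, o) in enumerate(INR_LAYERS):
        w = params[offset:offset + i * o].view(o, i); offset += i * o
        b = params[offset:offset + o]; offset += o
        x = F.linear(x, w, b)
        if idx < len(INR_LAYERS) - 1:
            x = torch.sin(x)  # sine activation, as in SIREN-style INRs (an assumption here)
    return x

# Usage: reconstruct a waveform from its generated INR.
hyper = HyperNetwork()
wave = torch.randn(WAVE_LEN)                      # stand-in audio sample
t = torch.linspace(-1, 1, WAVE_LEN).unsqueeze(1)  # normalized time axis
recon = eval_inr(hyper(wave), t).squeeze(1)       # same shape as `wave`
loss = F.mse_loss(recon, wave)                    # train the hypernetwork on this
```

The key property this sketch illustrates is the meta-learning aspect: only the hypernetwork has trainable parameters, so after training it can emit an INR for a waveform it has never seen, without any per-sample optimization.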