DiffPhase: Generative Diffusion-based STFT Phase Retrieval

Paper title

DiffPhase: Generative Diffusion-based STFT Phase Retrieval

Authors

Tal Peer, Simon Welker, Timo Gerkmann

Abstract

Diffusion probabilistic models have been recently used in a variety of tasks, including speech enhancement and synthesis. As a generative approach, diffusion models have been shown to be especially suitable for imputation problems, where missing data is generated based on existing data. Phase retrieval is inherently an imputation problem, where phase information has to be generated based on the given magnitude. In this work we build upon previous work in the speech domain, adapting a speech enhancement diffusion model specifically for STFT phase retrieval. Evaluation using speech quality and intelligibility metrics shows the diffusion approach is well-suited to the phase retrieval task, with performance surpassing both classical and modern methods.
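
To make the problem setup concrete, the following is a minimal sketch of STFT phase retrieval using the classical Griffin-Lim iteration. It is not the diffusion method proposed in the paper, and the abstract does not say which classical methods were used for comparison. The frame length, hop size, test signal, and all function names are assumptions chosen for illustration. Only the STFT magnitude is given; the phase starts out random and is refined by alternating between the time domain and the STFT domain.

```python
import numpy as np

N_FFT, HOP = 64, 16  # assumed frame length and hop size (75% overlap)


def stft(x):
    """Hann-windowed STFT, returned with shape (frames, N_FFT // 2 + 1)."""
    win = np.hanning(N_FFT)
    n_frames = 1 + (len(x) - N_FFT) // HOP
    frames = np.stack([x[i * HOP:i * HOP + N_FFT] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def istft(X, length):
    """Least-squares overlap-add inverse of stft() above."""
    win = np.hanning(N_FFT)
    frames = np.fft.irfft(X, n=N_FFT, axis=1) * win
    y = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(frames):
        y[i * HOP:i * HOP + N_FFT] += frame
        norm[i * HOP:i * HOP + N_FFT] += win ** 2
    # Guard against division by zero where the window weight vanishes.
    return y / np.maximum(norm, 1e-8)


def griffin_lim(mag, length, n_iter=50, seed=0):
    """Estimate a signal whose STFT magnitude matches `mag`."""
    rng = np.random.default_rng(seed)
    # The phase is the missing data: start from a random guess.
    X = mag * np.exp(1j * rng.uniform(-np.pi, np.pi, mag.shape))
    errors = []
    for _ in range(n_iter):
        Y = stft(istft(X, length))  # project onto valid (consistent) STFTs
        errors.append(np.linalg.norm(np.abs(Y) - mag) / np.linalg.norm(mag))
        X = mag * np.exp(1j * np.angle(Y))  # keep new phase, restore magnitude
    return istft(X, length), errors


# Synthetic two-tone test signal (an assumption, not data from the paper).
t = np.arange(1024)
x = np.sin(2 * np.pi * 0.05 * t) + 0.5 * np.sin(2 * np.pi * 0.12 * t)
mag = np.abs(stft(x))  # the phase is discarded here
y, errors = griffin_lim(mag, len(x))
print(f"relative magnitude mismatch: first {errors[0]:.3f}, last {errors[-1]:.3f}")
```

Because a signal and its negation share the same STFT magnitude, the estimate is better judged by how well its magnitude matches the given one than by a sample-wise comparison with the original waveform.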
