Paper Title
Remix-cycle-consistent Learning on Adversarially Learned Separator for Accurate and Stable Unsupervised Speech Separation
Paper Authors
Paper Abstract
A new learning algorithm for speech separation networks is designed to explicitly reduce residual noise and artifacts in the separated signals in an unsupervised manner. Generative adversarial networks are known to be effective for constructing separation networks when the ground truth of the observed signals is inaccessible. However, their weak objective of distribution-to-distribution mapping makes learning unstable and limits performance. This study introduces the remix-cycle-consistency loss as a more appropriate objective function and uses it to fine-tune adversarially learned source separation models. The remix-cycle-consistency loss is defined as the difference between the mixed speech observed at the microphones and the pseudo-mixed speech obtained by alternating between separating the mixtures and remixing the separated outputs in a different combination. Minimizing this loss explicitly reduces the distortions in the output of the separation network. Experimental comparisons on multichannel speech separation demonstrated that the proposed method achieves high separation accuracy and learning stability comparable to supervised learning.
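The remix-cycle described in the abstract (separate two observed mixtures, remix the estimated sources in a swapped combination, separate the pseudo-mixtures, then remix back and compare against the original observations) can be sketched as a loss function. This is a minimal NumPy sketch under stated assumptions, not the paper's implementation: `separator` is a hypothetical stand-in for the learned separation network, assumed here to map one mixture of two speakers to a pair of estimated sources in a consistent speaker order.

```python
import numpy as np

def remix_cycle_loss(mix1, mix2, separator):
    """Remix-cycle-consistency loss between two observed two-speaker mixtures.

    separator: callable mapping a mixture waveform to a tuple of two
    estimated source waveforms (a hypothetical placeholder for the
    adversarially learned separation network).
    """
    # 1) Separate each observed mixture into estimated sources.
    s1a, s1b = separator(mix1)
    s2a, s2b = separator(mix2)

    # 2) Remix the outputs in another combination to form pseudo-mixtures.
    p1 = s1a + s2b
    p2 = s2a + s1b

    # 3) Separate the pseudo-mixtures.
    q1a, q1b = separator(p1)
    q2a, q2b = separator(p2)

    # 4) Remix back with the original pairing; if separation is accurate,
    #    the cycle reconstructs the observed mixtures.
    r1 = q1a + q2b
    r2 = q2a + q1b

    # Mean-squared difference between observed and cycle-reconstructed mixes.
    return np.mean((r1 - mix1) ** 2) + np.mean((r2 - mix2) ** 2)
```

As a sanity check of the cycle, an oracle separator drives the loss to zero: if the two speakers occupy disjoint supports (here, a toy even/odd-sample split), masking recovers each source exactly, so remixing after two separation passes reproduces the original mixtures.

```python
rng = np.random.default_rng(0)
n = 16
mask = (np.arange(n) % 2 == 0)          # toy disjoint supports for the two speakers
a1, a2 = rng.normal(size=n) * mask, rng.normal(size=n) * mask
b1, b2 = rng.normal(size=n) * ~mask, rng.normal(size=n) * ~mask

oracle = lambda m: (m * mask, m * ~mask)  # perfect separator for this toy setup
loss = remix_cycle_loss(a1 + b1, a2 + b2, oracle)
```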