Paper Title

Transferring Source Style in Non-Parallel Voice Conversion

Authors

Songxiang Liu, Yuewen Cao, Shiyin Kang, Na Hu, Xunying Liu, Dan Su, Dong Yu, Helen Meng

Abstract

Voice conversion (VC) techniques aim to modify the speaker identity of an utterance while preserving the underlying linguistic information. Most VC approaches ignore modeling of the speaking style (e.g. emotion and emphasis), which may contain factors intentionally added by the speaker and should be retained during conversion. This study proposes a sequence-to-sequence based non-parallel VC approach, which is capable of transferring the speaking style from the source speech to the converted speech by modeling it explicitly. Objective evaluation and subjective listening tests show the superiority of the proposed VC approach in terms of speech naturalness and speaker similarity of the converted speech. Experiments are also conducted to show the source-style transferability of the proposed approach.
