Paper Title

Leveraging Pseudo-labeled Data to Improve Direct Speech-to-Speech Translation

Authors

Qianqian Dong, Fengpeng Yue, Tom Ko, Mingxuan Wang, Qibing Bai, Yu Zhang

Abstract

Direct speech-to-speech translation (S2ST) has drawn increasing attention recently. The task is very challenging due to data scarcity and the complexity of the speech-to-speech mapping. In this paper, we report our recent achievements in S2ST. First, we build an S2ST Transformer baseline that outperforms the original Translatotron. Second, we use external data through pseudo-labeling and obtain a new state-of-the-art result on the Fisher English-to-Spanish test set. In particular, we exploit the pseudo-labeled data with a combination of popular techniques that are not trivial to apply to S2ST. Moreover, we evaluate our approach on both syntactically similar (Spanish-English) and distant (English-Chinese) language pairs. Our implementation is available at https://github.com/fengpeng-yue/speech-to-speech-translation.
