SCA: Streaming Cross-attention Alignment for Echo Cancellation

Paper title

SCA: Streaming Cross-attention Alignment for Echo Cancellation

Authors

Liu, Yang; Shi, Yangyang; Li, Yun; Kalgaonkar, Kaustubh; Srinivasan, Sriram; Lei, Xin

Abstract

End-to-end deep learning has shown promising results for speech enhancement tasks such as noise suppression, dereverberation, and speech separation. However, most state-of-the-art methods for echo cancellation are either classical DSP-based or hybrid DSP-ML algorithms. Components such as the delay estimator and the adaptive linear filter are based on traditional signal processing concepts, and deep learning algorithms typically serve only to replace the non-linear residual echo suppressor. This paper introduces an end-to-end echo cancellation network with streaming cross-attention alignment (SCA). Our proposed method can handle unaligned inputs without requiring external alignment and generates high-quality speech without echoes. At the same time, the end-to-end algorithm simplifies the current echo cancellation pipeline for time-variant echo path cases. We test our proposed method on the ICASSP 2022 and Interspeech 2021 Microsoft deep echo cancellation challenge evaluation datasets, where our method outperforms some of the other hybrid and end-to-end methods.
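The abstract does not describe the network itself, so the following is only a minimal sketch of the general idea behind cross-attention alignment, not the authors' architecture. It assumes that each microphone frame acts as a query over a causal window of far-end reference frames, and that the attention weights take the place of an external delay estimator. The function names, the `max_delay` window, and the use of raw feature vectors as keys and values (with no learned projections) are all hypothetical choices made for illustration.

```python
import math


def streaming_cross_attention(mic, ref, max_delay):
    """Softly align far-end reference frames to microphone frames.

    mic, ref: equally long lists of feature vectors (lists of floats).
    max_delay: hypothetical look-back window in frames (an assumption).
    Returns (aligned, weights): for each microphone frame t, the
    attention-weighted reference frame and the weights over
    ref[t - max_delay .. t].
    """
    dim = len(mic[0])
    scale = 1.0 / math.sqrt(dim)
    aligned, weights = [], []
    for t, query in enumerate(mic):
        # Causal window: only current and past reference frames are visible,
        # which is what makes the computation streamable.
        start = max(0, t - max_delay)
        keys = ref[start:t + 1]
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
        # Numerically stable softmax over the window.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        w = [e / total for e in exps]
        # Weighted sum of reference frames (the keys double as values here).
        out = [sum(wi * key[d] for wi, key in zip(w, keys)) for d in range(dim)]
        aligned.append(out)
        weights.append(w)
    return aligned, weights


def estimated_delay(w):
    """Delay, in frames, of the most strongly attended reference frame."""
    best = max(range(len(w)), key=w.__getitem__)
    return len(w) - 1 - best
```

In this toy setting, if `mic` is a copy of `ref` delayed by a few frames, the attention weights peak at that delay and the aligned output reproduces the matching reference frame. A real system would presumably learn the query, key, and value projections and feed the aligned reference into the rest of the echo cancellation network.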
