Paper title
Deep model with built-in cross-attention alignment for acoustic echo cancellation
Paper authors
Abstract
With recent research advances, deep learning models have become an attractive choice for acoustic echo cancellation (AEC) in real-time teleconferencing applications. Since acoustic echo is one of the major sources of poor audio quality, a wide variety of deep models have been proposed. However, an important but often omitted requirement for good echo cancellation quality is the synchronization of the microphone and far-end signals. Typically implemented using classical algorithms based on cross-correlation, the alignment module is a separate functional block with known design limitations. In our work, we propose a deep learning architecture with built-in self-attention-based alignment, which is able to handle unaligned inputs, improving echo cancellation performance while simplifying the communication pipeline. Moreover, we show that our approach achieves significant improvements for difficult delay estimation cases on real recordings from the AEC Challenge data set.
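
To make the classical alignment step mentioned in the abstract concrete, the following is a minimal sketch of delay estimation by cross-correlation between the far-end and microphone signals. It is not the attention-based method the paper proposes. The function names (`estimate_delay`, `make_example`), the synthetic white-noise signals, the echo gain, and the 37-sample delay are all illustrative assumptions and do not come from the paper.

```python
import random


def estimate_delay(far_end, mic, max_lag):
    """Return the lag (in samples) at which mic best matches far_end.

    Brute-force cross-correlation: for each candidate lag d in [0, max_lag],
    sum far_end[i] * mic[i + d] and keep the lag with the largest sum.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        n = min(len(far_end), len(mic) - lag)
        score = sum(far_end[i] * mic[i + lag] for i in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def make_example(delay, length=2000, echo_gain=0.6, seed=0):
    """Build a synthetic far-end signal and a mic signal with a delayed echo.

    Hypothetical test data: white Gaussian noise as the far-end signal,
    and a scaled, delayed copy plus a little noise as the microphone signal.
    """
    rng = random.Random(seed)
    far_end = [rng.gauss(0.0, 1.0) for _ in range(length)]
    mic = [0.0] * length
    for i in range(delay, length):
        mic[i] = echo_gain * far_end[i - delay] + 0.05 * rng.gauss(0.0, 1.0)
    return far_end, mic


far_end, mic = make_example(delay=37)
estimated = estimate_delay(far_end, mic, max_lag=100)

# Shift the far-end signal by the estimated delay so it lines up with the mic.
aligned_far_end = [0.0] * estimated + far_end[: len(far_end) - estimated]
print(estimated)
```

A single global lag like this works for a clean, stationary echo path. Real recordings with changing or ambiguous delays are the difficult delay estimation cases the abstract refers to, where a separate block of this kind has known limitations.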