Acoustic echo cancellation with the dual-signal transformation LSTM network

Paper title

Acoustic echo cancellation with the dual-signal transformation LSTM network

Authors

Westhausen, Nils L., Meyer, Bernd T.

Abstract

This paper applies the dual-signal transformation LSTM network (DTLN) to the task of real-time acoustic echo cancellation (AEC). The DTLN combines a short-time Fourier transformation and a learned feature representation in a stacked network approach, which enables robust information processing in both the time-frequency domain and the time domain and also includes phase information. The model is trained on only 60 h of real and synthetic echo scenarios. The training setup includes multi-lingual speech, data augmentation, additional noise and reverberation to create a model that should generalize well to a large variety of real-world conditions. The DTLN approach produces state-of-the-art performance on clean and noisy echo conditions, robustly reducing acoustic echo and additional noise. The method outperforms the AEC-Challenge baseline by 0.30 in terms of Mean Opinion Score (MOS).
