Paper Title

Efficient Low-Latency Speech Enhancement with Mobile Audio Streaming Networks

Authors

Michał Romaniuk, Piotr Masztalski, Karol Piaskowski, Mateusz Matuszewski

Abstract

We propose Mobile Audio Streaming Networks (MASnet) for efficient low-latency speech enhancement, which is particularly suitable for mobile devices and other applications where computational capacity is a limitation. MASnet processes linear-scale spectrograms, transforming successive noisy frames into complex-valued ratio masks which are then applied to the respective noisy frames. MASnet can operate in a low-latency incremental inference mode which matches the complexity of layer-by-layer batch mode. Compared to a similar fully-convolutional architecture, MASnet incorporates depthwise and pointwise convolutions for a large reduction in fused multiply-accumulate operations per second (FMA/s), at the cost of some reduction in SNR.
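Two ideas in the abstract can be illustrated with a minimal sketch: applying a complex-valued mask to a noisy frame, and the saving in multiply-accumulate operations when a standard convolution is replaced by a depthwise convolution followed by a pointwise one. The function names, the example frame, and the layer sizes (kernel size 3, 64 channels) are assumptions chosen for illustration and are not taken from the paper; the operation counts use the textbook formulas for a simplified 1-D convolution, not the actual MASnet layers.

```python
def apply_complex_mask(noisy_frame, mask):
    """Multiply each noisy spectrogram bin by its complex mask value."""
    if len(noisy_frame) != len(mask):
        raise ValueError("frame and mask must have the same number of bins")
    return [m * x for m, x in zip(mask, noisy_frame)]


def standard_conv_fma(kernel_size, in_channels, out_channels):
    """FMAs per output time step for a standard 1-D convolution."""
    return kernel_size * in_channels * out_channels


def separable_conv_fma(kernel_size, in_channels, out_channels):
    """FMAs per output time step for a depthwise conv plus a pointwise (1x1) conv."""
    depthwise = kernel_size * in_channels   # one filter per input channel
    pointwise = in_channels * out_channels  # 1x1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical three-bin frame and mask (illustrative values only).
noisy = [1 + 1j, 2 - 1j, 0.5j]
mask = [0.5 + 0j, 1j, 1 + 0j]
enhanced = apply_complex_mask(noisy, mask)

# Hypothetical layer: kernel size 3, 64 input and 64 output channels.
full = standard_conv_fma(3, 64, 64)
light = separable_conv_fma(3, 64, 64)
print(enhanced, full, light)
```

Multiplying the per-step counts by the number of frames processed per second would give a figure in FMA/s for this toy layer. Because the mask is complex, it can change the phase of each bin as well as its magnitude.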
