Paper Title

Speaker Diarization Based on Multi-channel Microphone Array in Small-scale Meeting

Authors

Du, Yuxuan; Zhou, Ruohua

Abstract

Small-scale meetings account for a large proportion of speaker diarization scenarios. When a microphone array is used as the recording device, most researchers ignore its spatial information. In this paper, inspired by a clustering method that combines d-vectors with microphone array spatial vectors, we propose a diarization method that uses a multi-channel microphone array for meetings with no more than 4 speakers. We apply speech enhancement to preprocess the audio from the microphone array. The Steered-Response Power Phase Transform (SRP-PHAT) algorithm is employed to identify speakers more accurately, and the number of speakers is then used to recluster the speech segments for better performance. Finally, we fuse the system outputs with DOVER-Lap to obtain the best result. We evaluated our system on the AMI corpus. Compared with the best experimental results reported so far, our system substantially reduces the diarization error rate (DER).
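For readers unfamiliar with the metric, DER is conventionally the sum of missed speech, false-alarm speech, and speaker-confusion time, divided by the total reference speech time. The sketch below illustrates that definition at frame level only. It is not the scoring procedure used in the paper: it assumes at most one active speaker per frame (no overlap), applies no forgiveness collar, and the function name `frame_der` is hypothetical. Because the meetings considered have no more than 4 speakers, the best one-to-one mapping between hypothesis and reference labels can be found by brute force over permutations.

```python
from itertools import permutations


def frame_der(reference, hypothesis):
    """Simplified frame-level DER (assumption: one label per frame, None = silence)."""
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same number of frames")

    pairs = list(zip(reference, hypothesis))
    total_speech = sum(r is not None for r, _ in pairs)
    if total_speech == 0:
        raise ValueError("reference contains no speech frames")

    # Speech in the reference that the system labeled as silence.
    missed = sum(r is not None and h is None for r, h in pairs)
    # Silence in the reference that the system labeled as speech.
    false_alarm = sum(r is None and h is not None for r, h in pairs)
    # Frames where both sides claim speech; these are correct or confused.
    both_active = sum(r is not None and h is not None for r, h in pairs)

    ref_speakers = sorted({r for r in reference if r is not None})
    hyp_speakers = sorted({h for h in hypothesis if h is not None})

    # Pad the shorter list with None so permutations cover every one-to-one mapping.
    n = max(len(ref_speakers), len(hyp_speakers))
    padded_ref = ref_speakers + [None] * (n - len(ref_speakers))
    padded_hyp = hyp_speakers + [None] * (n - len(hyp_speakers))

    best_correct = 0
    for perm in permutations(padded_ref):
        mapping = {h: r for h, r in zip(padded_hyp, perm)
                   if h is not None and r is not None}
        correct = sum(
            1 for r, h in pairs
            if r is not None and h is not None and mapping.get(h) == r
        )
        best_correct = max(best_correct, correct)

    confusion = both_active - best_correct
    return (missed + false_alarm + confusion) / total_speech


if __name__ == "__main__":
    ref = ["A", "A", "A", "B", "B", None, None, "B"]
    hyp = ["x", "x", "y", "y", "y", None, "y", None]
    # 1 missed + 1 false alarm + 1 confused frame over 6 reference speech frames.
    print(frame_der(ref, hyp))  # 0.5
```

Established scoring tools additionally handle overlapping speech and time-based segments, so values from this sketch are not directly comparable with published AMI results.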
