Paper title
The INTERSPEECH 2020 Far-Field Speaker Verification Challenge
Paper authors
Abstract
The INTERSPEECH 2020 Far-Field Speaker Verification Challenge (FFSVC 2020) addresses three research problems under well-defined conditions: far-field text-dependent speaker verification from a single microphone array, far-field text-independent speaker verification from a single microphone array, and far-field text-dependent speaker verification from distributed microphone arrays. All three tasks pose a cross-channel challenge to participants. To simulate a real-life scenario, the enrollment utterances are recorded with a close-talk cellphone, while the test utterances are recorded with far-field microphone arrays. In this paper, we describe the database, the challenge, and the baseline system, which uses a ResNet-based deep speaker network with cosine similarity scoring. For a given utterance, the speaker embeddings of the different channels are averaged with equal weights to form the final embedding. The baseline system achieves minDCFs of 0.62, 0.66, and 0.64 and EERs of 6.27%, 6.55%, and 7.18% for task 1, task 2, and task 3, respectively.
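The scoring step described in the abstract (equal-weight averaging of per-channel embeddings, followed by cosine similarity against the enrollment embedding) can be illustrated with a minimal sketch. This is not the challenge's baseline code: the function names `average_embeddings` and `cosine_score` are hypothetical, the three-dimensional vectors are toy values, and the ResNet that would produce real embeddings is not shown.

```python
import math
from typing import List, Sequence


def average_embeddings(channel_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Equal-weight mean of the per-channel embeddings of one utterance."""
    if not channel_embeddings:
        raise ValueError("at least one channel embedding is required")
    dim = len(channel_embeddings[0])
    if any(len(e) != dim for e in channel_embeddings):
        raise ValueError("all channel embeddings must have the same dimension")
    n = len(channel_embeddings)
    return [sum(e[i] for e in channel_embeddings) / n for i in range(dim)]


def cosine_score(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embeddings; higher means more similar."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy values (assumed for illustration only, not taken from the paper).
enrollment_embedding = [1.0, 0.0, 1.0]          # from the close-talk recording
test_channel_embeddings = [                      # one embedding per far-field channel
    [0.9, 0.1, 1.1],
    [1.1, -0.1, 0.9],
]

test_embedding = average_embeddings(test_channel_embeddings)
score = cosine_score(enrollment_embedding, test_embedding)
```

In a verification trial, `score` would be compared with a decision threshold; sweeping that threshold over many trials is what yields metrics such as EER and minDCF.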