Paper Title

Effective data screening technique for crowdsourced speech intelligibility experiments: Evaluation with IRM-based speech enhancement

Authors

Yamamoto, Ayako; Irino, Toshio; Araki, Shoko; Arai, Kenichi; Ogawa, Atsunori; Kinoshita, Keisuke; Nakatani, Tomohiro

Abstract

It is essential to perform speech intelligibility (SI) experiments with human listeners in order to evaluate objective intelligibility measures for developing effective speech enhancement and noise reduction algorithms. Recently, crowdsourced remote testing has become a popular means for collecting a massive amount and variety of data at a relatively small cost and in a short time. However, careful data screening is essential for attaining reliable SI data. We performed SI experiments on speech enhanced by an "oracle" ideal ratio mask (IRM) in a well-controlled laboratory and in crowdsourced remote environments that could not be controlled directly. We introduced simple tone pip tests, in which participants were asked to report the number of audible tone pips, to estimate their listening levels above audible thresholds. The tone pip tests were very effective for data screening to reduce the variability of crowdsourced remote results so that the laboratory results would become similar. The results also demonstrated the SI of an oracle IRM, giving us the upper limit of the mask-based single-channel speech enhancement.
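
For readers unfamiliar with the term, a ratio mask is commonly defined per time-frequency bin as (S / (S + N))^β, where S and N are the clean-speech and noise powers and β is often 0.5. It is called "oracle" because it is computed from the separately known clean speech and noise, which is why it serves as an upper limit. The sketch below assumes that common definition; the abstract does not state the exact formula used in the experiments, and the function name, the β value, and the sample powers are illustrative only.

```python
def ideal_ratio_mask(speech_power, noise_power, beta=0.5):
    """Return one mask value in [0, 1] per time-frequency bin.

    Assumes the common definition (S / (S + N)) ** beta; this is an
    illustration, not necessarily the formula used in the paper.
    """
    mask = []
    for s, n in zip(speech_power, noise_power):
        total = s + n
        # A bin with neither speech nor noise carries nothing to keep.
        mask.append((s / total) ** beta if total > 0 else 0.0)
    return mask


# Hypothetical power values for four time-frequency bins.
speech = [4.0, 1.0, 0.0, 9.0]
noise = [0.0, 1.0, 2.0, 3.0]
mask = ideal_ratio_mask(speech, noise)
```

Bins dominated by speech get a mask value near 1 and are passed through; bins dominated by noise get a value near 0 and are attenuated. Multiplying the noisy spectrogram by this mask gives the enhanced signal whose intelligibility the experiments measure.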
