Paper Title

Detection of Glottal Closure Instants from Speech Signals: a Quantitative Review

Authors

Thomas Drugman, Mark Thomas, Jon Gudnason, Patrick Naylor, Thierry Dutoit

Abstract

The pseudo-periodicity of voiced speech can be exploited in several speech processing applications. This requires, however, that the precise locations of the Glottal Closure Instants (GCIs) are available. The focus of this paper is the evaluation of automatic methods for the detection of GCIs directly from the speech waveform. Five state-of-the-art GCI detection algorithms are compared using six different databases with contemporaneous electroglottographic recordings as ground truth, and containing many hours of speech by multiple speakers. The five techniques compared are the Hilbert Envelope-based detection (HE), the Zero Frequency Resonator-based method (ZFR), the Dynamic Programming Phase Slope Algorithm (DYPSA), the Speech Event Detection using the Residual Excitation And a Mean-based Signal (SEDREAMS) and Yet Another GCI Algorithm (YAGA). The efficacy of these methods is first evaluated on clean speech, both in terms of reliability and accuracy. Their robustness to additive noise and to reverberation is also assessed. A further contribution of the paper is the evaluation of their performance on a concrete application of speech processing: the causal-anticausal decomposition of speech. It is shown that for clean speech, SEDREAMS and YAGA are the best performing techniques, both in terms of identification rate and accuracy. ZFR and SEDREAMS also show a superior robustness to additive noise and reverberation.
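
The abstract reports results in terms of identification rate and accuracy without defining them. As a rough illustration only, the sketch below scores detected GCIs against reference GCIs (for example, instants derived from an electroglottographic recording) using one common per-cycle convention: each reference GCI owns the interval between the midpoints to its neighbours, and a cycle counts as identified when exactly one detection falls inside it. The function name `evaluate_gci`, the dictionary keys, and the exact cycle boundaries are assumptions made for this example and are not taken from the abstract.

```python
from bisect import bisect_left
from statistics import pstdev


def evaluate_gci(reference, detected):
    """Score detected GCI times against reference GCI times (both in seconds).

    Assumed convention (not stated in the abstract): the cycle of a reference
    GCI spans from the midpoint with the previous reference GCI to the
    midpoint with the next one.
    """
    ref = sorted(reference)
    det = sorted(detected)
    if len(ref) < 2:
        raise ValueError("need at least two reference GCIs")

    # Midpoints between consecutive reference GCIs delimit the cycles.
    mids = [(a + b) / 2.0 for a, b in zip(ref, ref[1:])]
    # Extend the first and last cycles symmetrically around their GCI.
    lower = [ref[0] - (mids[0] - ref[0])] + mids
    upper = mids + [ref[-1] + (ref[-1] - mids[-1])]

    hits = misses = false_alarms = 0
    errors = []  # timing errors for cycles with exactly one detection
    for r, lo, hi in zip(ref, lower, upper):
        i = bisect_left(det, lo)  # first detection >= lo
        j = bisect_left(det, hi)  # first detection >= hi
        count = j - i             # detections in [lo, hi)
        if count == 0:
            misses += 1
        elif count == 1:
            hits += 1
            errors.append(det[i] - r)
        else:
            false_alarms += 1

    n = len(ref)
    return {
        "identification_rate": hits / n,
        "miss_rate": misses / n,
        "false_alarm_rate": false_alarms / n,
        # Spread of the timing error over identified cycles.
        "timing_error_std": pstdev(errors) if errors else float("nan"),
    }


if __name__ == "__main__":
    reference = [0.010, 0.020, 0.030, 0.040]
    detected = [0.0101, 0.0299, 0.0395, 0.0405]
    print(evaluate_gci(reference, detected))
```

In this toy input the second cycle has no detection and the fourth has two, so only half of the cycles count as identified; the timing-error spread is computed from the two remaining cycles.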
