Time-varying harmonic models for voice signal analysis

Paper title

Time-varying harmonic models for voice signal analysis

Authors

Takeshi Ikuma, Andrew J. McWhorter, Lacey Adkins, Melda Kunduk

Abstract

Assessment of voice signals has long been performed with the assumption of periodicity, as this facilitates analysis. Near periodicity of normal voice signals makes short-time harmonic modeling an appealing choice to extract vocal feature parameters. For dysphonic voice, however, a fixed harmonic structure could be too constrained as it strictly enforces periodicity in the model. Slight variation in amplitude or frequency in the signal may cause the model to misrepresent the observed signal. To address these issues, this paper presents a time-varying harmonic model, which allows its fundamental frequency and harmonic amplitudes to be polynomial functions of time. The model decouples the slow deviations of frequency and amplitude from fast irregular vocal fold vibratory behaviors such as subharmonics and diplophonia. The time-varying model is shown to track the frequency and amplitude modulations present in voice with severe tremor. This reduces the sensitivity of the model-based harmonics-to-noise ratio measures to slow frequency and amplitude variations while maintaining their sensitivity to increases in turbulent noise or the presence of irregular vibration. Other uses of the model include vocal tract filter estimation and estimation of the rates of frequency and intensity change. These use cases are experimentally demonstrated along with the modeling accuracy.
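The signal form described in the abstract can be illustrated with a minimal Python sketch. It only synthesizes a harmonic signal whose fundamental frequency and harmonic amplitudes are polynomials in time, and computes an energy-based harmonics-to-noise ratio; it does not reproduce the paper's estimation procedure. The function names (`synth_harmonics`, `hnr_db`), the cosine parameterization, the phase obtained by integrating the frequency polynomial from t = 0, and all numeric values are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def synth_harmonics(t, f0_coeffs, amp_coeffs):
    """Synthesize a time-varying harmonic signal (illustrative assumption).

    t          : sample times in seconds
    f0_coeffs  : polynomial coefficients of f0(t) in Hz, lowest order first
    amp_coeffs : one coefficient sequence per harmonic, lowest order first
    """
    # Phase of the fundamental: 2*pi times the integral of f0 from 0 to t.
    phase = 2 * np.pi * P.polyval(t, P.polyint(f0_coeffs))
    x = np.zeros_like(t, dtype=float)
    for k, a in enumerate(amp_coeffs, start=1):
        # The k-th harmonic follows k times the fundamental phase.
        x += P.polyval(t, a) * np.cos(k * phase)
    return x


def hnr_db(harmonic, noise):
    """Energy ratio of the harmonic part to the residual, in dB."""
    return 10 * np.log10(np.sum(harmonic**2) / np.sum(noise**2))


# Hypothetical frame: 50 ms at 8 kHz, f0 gliding upward from 100 Hz
# at 50 Hz/s, with two harmonics whose amplitudes drift linearly.
fs = 8000
t = np.arange(0, 0.05, 1 / fs)
harmonic = synth_harmonics(t, [100.0, 50.0], [[1.0, -2.0], [0.5, 1.0]])
rng = np.random.default_rng(0)
noise = 0.05 * rng.standard_normal(t.size)
print(f"HNR of the synthetic frame: {hnr_db(harmonic, noise):.1f} dB")
```

With constant polynomials (a single coefficient each) the sketch reduces to a fixed harmonic model, which is the strictly periodic case the abstract describes as too constrained for dysphonic voice.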
