Paper Title

A Persian ASR-based SER: Modification of Sharif Emotional Speech Database and Investigation of Persian Text Corpora

Authors

Yazdani, Ali; Shekofteh, Yasser

Abstract

Speech Emotion Recognition (SER) is one of the essential perceptual abilities humans rely on to understand a situation and decide how to interact with others. In recent years, researchers have therefore tried to add the ability to recognize emotions to human-machine communication systems. Because the SER process relies on labeled data, databases are essential to it: incomplete, low-quality, or defective data may lead to inaccurate predictions. In this paper, we fixed the inconsistencies in the Sharif Emotional Speech Database (ShEMO), a Persian database, by using an Automatic Speech Recognition (ASR) system, and we investigated the effect of Farsi language models obtained from accessible Persian text corpora. We also introduced a Persian/Farsi ASR-based SER system that uses linguistic features of the ASR outputs together with Deep Learning-based models.
