Automatic Detection of Expressed Emotion from Five-Minute Speech Samples: Challenges and Opportunities

Paper title

Automatic Detection of Expressed Emotion from Five-Minute Speech Samples: Challenges and Opportunities

Authors

Bahman Mirheidari, André Bittar, Nicholas Cummins, Johnny Downs, Helen L. Fisher, Heidi Christensen

Abstract

We present a novel feasibility study on the automatic recognition of Expressed Emotion (EE), a family environment concept based on caregivers speaking freely about their relative/family member. We describe an automated approach for determining the degree of warmth, a key component of EE, from acoustic and text features acquired from a sample of 37 recorded interviews. These recordings, collected over 20 years ago, are derived from a nationally representative birth cohort of 2,232 British twin children and were manually coded for EE. We outline the core steps of extracting usable information from recordings with highly variable audio quality and assess the efficacy of four machine learning approaches trained with different combinations of acoustic and text features. Despite the challenges of working with this legacy data, we demonstrated that the degree of warmth can be predicted with an F1-score of 61.5%. In this paper, we summarise our learning and provide recommendations for future work using real-world speech samples.
