Paper Title
Refining Automatic Speech Recognition System for Older Adults
Paper Authors
Abstract
Building a high-quality automatic speech recognition (ASR) system with limited training data is a challenging task, particularly for a narrow target population. Open-source ASR systems, trained on ample data from adults, degrade on seniors' speech because of the acoustic mismatch between adults and seniors. With 12 hours of training data, we attempt to develop an ASR system for socially isolated seniors (80+ years old) with possible cognitive impairments. We show experimentally that ASR built for the adult population performs poorly on our target population and that transfer learning (TL) can boost the system's performance. Building on the fundamental idea of TL, namely tuning model parameters, we further improve the system by leveraging an attention mechanism to utilize the model's intermediate information. Our approach achieves a 1.58% absolute improvement over the TL model.
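The abstract does not say how the attention mechanism combines the model's intermediate information. Purely as an illustration, one possible reading is scaled dot-product attention over per-layer feature vectors, sketched below. The function names, the `query` vector, and the toy layer outputs are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_over_layers(layer_outputs, query):
    """Fuse per-layer feature vectors with scaled dot-product attention.

    layer_outputs: list of equal-length vectors, one per intermediate
                   layer (hypothetical stand-ins for real activations).
    query:         vector of the same length (assumed here to be a
                   learned parameter; fixed by hand in this sketch).
    Returns (weights, fused), where weights sum to 1 and fused is the
    attention-weighted average of the layer vectors.
    """
    dim = len(query)
    scores = [
        sum(h_i * q_i for h_i, q_i in zip(h, query)) / math.sqrt(dim)
        for h in layer_outputs
    ]
    weights = softmax(scores)
    fused = [
        sum(w * h[j] for w, h in zip(weights, layer_outputs))
        for j in range(dim)
    ]
    return weights, fused


# Toy example: three "layers", each producing a 2-dimensional feature.
layers = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
weights, fused = attend_over_layers(layers, query)
print(weights, fused)
```

In a transfer-learning setting, the pretrained layers would supply `layer_outputs`, and the query (or a small attention module in its place) would be trained together with the fine-tuned parameters. That arrangement is an assumption of this sketch, not a description of the authors' system.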