Acoustics Based Intent Recognition Using Discovered Phonetic Units for Low Resource Languages

Paper title

Acoustics Based Intent Recognition Using Discovered Phonetic Units for Low Resource Languages

Authors

Akshat Gupta, Xinjian Li, Sai Krishna Rallabandi, Alan W Black

Abstract

With recent advances in language technologies, humans are now speaking to devices. Extending the reach of spoken language technologies requires building systems in local languages. A major bottleneck here is the set of data-intensive components that make up such systems, including automatic speech recognition (ASR) systems that require large amounts of labelled data. To aid the development of spoken dialog systems in low-resource languages, we propose a novel acoustics-based intent recognition system that uses discovered phonetic units for intent classification. The system is made up of two blocks: the first is a universal phone recognition system that generates a transcript of discovered phonetic units for the input audio, and the second performs intent classification from the generated phonetic transcripts. We propose a CNN+LSTM based architecture and present results for two language families, Indic languages and Romance languages, on two different intent recognition tasks. We also perform multilingual training of our intent classifier and show improved cross-lingual transfer and zero-shot performance on an unknown language within the same language family.
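
The hand-off between the two blocks described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it only shows one plausible way to turn phone transcripts into fixed-length integer sequences, which is the kind of input an embedding layer in front of a CNN+LSTM classifier would typically consume. The phone symbols, the `<pad>`/`<unk>` tokens, and the helper names `build_vocab` and `encode` are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical special tokens; the paper does not specify these.
PAD, UNK = "<pad>", "<unk>"


def build_vocab(transcripts: List[List[str]]) -> Dict[str, int]:
    """Map each phonetic unit seen in the training transcripts to an integer id."""
    vocab = {PAD: 0, UNK: 1}
    for phones in transcripts:
        for phone in phones:
            # Assign the next free id the first time a phone is seen.
            vocab.setdefault(phone, len(vocab))
    return vocab


def encode(phones: List[str], vocab: Dict[str, int], max_len: int) -> List[int]:
    """Convert one phone transcript into a fixed-length sequence of ids."""
    # Truncate to max_len and map unseen phones to the UNK id.
    ids = [vocab.get(phone, vocab[UNK]) for phone in phones[:max_len]]
    # Right-pad short transcripts with the PAD id.
    return ids + [vocab[PAD]] * (max_len - len(ids))


# Made-up phone transcripts standing in for the output of the first block.
train_transcripts = [["k", "a", "l", "a"], ["b", "a", "l"]]
vocab = build_vocab(train_transcripts)
batch = [encode(t, vocab, max_len=5) for t in train_transcripts]
```

Mapping unseen phones to a shared `<unk>` id is one simple way to keep the classifier's input well defined when a transcript from a new language contains units that never appeared in training; whether the original system does this is not stated in the abstract.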
