Title
A light transformer for speech-to-intent applications
Authors
Abstract
Spoken language understanding (SLU) systems can make life more agreeable, safer (e.g., in a car), or can increase the independence of physically challenged users. However, due to the many sources of variation in speech, a well-trained system is hard to transfer to other conditions, such as a different language or speech-impaired users. A remedy is to design a user-taught SLU system that learns fully from scratch from the user's demonstrations, which in turn requires that the system's model converge quickly after only a few training samples. In this paper, we propose a light transformer structure that uses a simplified relative position encoding with the goal of reducing the model size and improving efficiency. The light transformer serves as an alternative speech encoder for an existing user-taught multitask SLU system. Experimental results on three datasets with challenging speech conditions show that our approach outperforms the existing system and other state-of-the-art models with half the original model size and training time.
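
The abstract names a simplified relative position encoding as the source of the size and efficiency gains but does not spell out its form. Below is a minimal sketch, assuming a per-head scalar bias indexed by clipped relative distance and added to the attention logits (a common lightweight simplification of Shaw et al.'s relative attention); all names, the clipping distance, and the dimensions are illustrative assumptions, not the paper's exact design.

    # Minimal sketch: self-attention with a simplified relative position
    # bias. NOT the paper's exact formulation (the abstract does not
    # specify it); the clipping distance and dimensions are assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class RelPosSelfAttention(nn.Module):
        def __init__(self, d_model: int, n_heads: int, max_rel_dist: int = 64):
            super().__init__()
            assert d_model % n_heads == 0
            self.n_heads = n_heads
            self.d_head = d_model // n_heads
            self.max_rel_dist = max_rel_dist
            self.qkv = nn.Linear(d_model, 3 * d_model)
            self.out = nn.Linear(d_model, d_model)
            # One learned scalar per head and clipped relative distance,
            # instead of full relative key/value embeddings: far fewer
            # parameters than the standard relative-attention variant.
            self.rel_bias = nn.Parameter(torch.zeros(n_heads, 2 * max_rel_dist + 1))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, time, d_model), e.g. a sequence of acoustic frames
            b, t, _ = x.shape
            q, k, v = self.qkv(x).chunk(3, dim=-1)

            def split(z):  # -> (batch, heads, time, d_head)
                return z.view(b, t, self.n_heads, self.d_head).transpose(1, 2)

            q, k, v = split(q), split(k), split(v)

            # content-based attention logits
            logits = q @ k.transpose(-2, -1) / self.d_head ** 0.5

            # relative distances, clipped to [-max, max], shifted to >= 0
            pos = torch.arange(t, device=x.device)
            rel = (pos[None, :] - pos[:, None]).clamp(
                -self.max_rel_dist, self.max_rel_dist) + self.max_rel_dist
            logits = logits + self.rel_bias[:, rel]  # (heads, t, t) broadcast

            attn = F.softmax(logits, dim=-1)
            out = (attn @ v).transpose(1, 2).reshape(b, t, -1)
            return self.out(out)

    # usage: encode a batch of two 100-frame speech feature sequences
    layer = RelPosSelfAttention(d_model=256, n_heads=4)
    y = layer(torch.randn(2, 100, 256))  # -> (2, 100, 256)

Because the bias depends only on clipped relative distance, the parameter count for position information is n_heads * (2 * max_rel_dist + 1) scalars, independent of sequence length, which is consistent with the abstract's goal of reducing model size.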