Paper Title
Distilled Non-Semantic Speech Embeddings with Binary Neural Networks for Low-Resource Devices
Paper Authors
Paper Abstract
This work introduces BRILLsson, a novel binary neural network-based representation learning model for a broad range of non-semantic speech tasks. We train the model with knowledge distillation from a large and real-valued TRILLsson model, using only a fraction of the dataset used to train TRILLsson. The resulting BRILLsson models are only 2 MB in size with a latency of less than 8 ms, making them suitable for deployment in low-resource devices such as wearables. We evaluate BRILLsson on eight benchmark tasks (including but not limited to spoken language identification, emotion recognition, health condition diagnosis, and keyword spotting), and demonstrate that our proposed ultra-light and low-latency models perform as well as large-scale models.
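The abstract describes training a small binary student network to mimic the embeddings of a frozen, real-valued teacher (TRILLsson). A minimal sketch of such embedding-level distillation is shown below; the function name, the choice of an L2 matching loss, and the assumption that both models map audio batches directly to embeddings are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of embedding-level knowledge distillation, assuming a
# frozen real-valued teacher (e.g., TRILLsson) and a small binary student.
# The L2 loss and function signature are hypothetical, not from the paper.
import torch
import torch.nn as nn


def distillation_step(teacher: nn.Module,
                      student: nn.Module,
                      optimizer: torch.optim.Optimizer,
                      audio_batch: torch.Tensor) -> float:
    """One training step: match the student's embedding to the teacher's."""
    teacher.eval()
    with torch.no_grad():
        target = teacher(audio_batch)      # teacher embedding, no gradients
    pred = student(audio_batch)            # binary-network student embedding
    loss = nn.functional.mse_loss(pred, target)  # simple L2 distillation loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```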