Paper Title


Peak-First CTC: Reducing the Peak Latency of CTC Models by Applying Peak-First Regularization

Authors

Zhengkun Tian, Hongyu Xiang, Min Li, Feifei Lin, Ke Ding, Guanglu Wan

Abstract


The CTC model has been widely applied in many application scenarios because of its simple structure, excellent performance, and fast inference speed. There are many peaks in the probability distribution predicted by a CTC model, and each peak represents a non-blank token. The recognition latency of CTC models can be reduced by encouraging the model to predict peaks earlier. Existing latency-reduction methods require modifying the transition relationships between tokens in the forward-backward algorithm and the gradient computation. Some of these methods even depend on forced-alignment results provided by other pretrained models. These methods are complex to implement. To reduce the peak latency, we propose a simple and novel method named peak-first regularization, which uses a frame-wise knowledge distillation function to force the probability distribution of the CTC model to shift left along the time axis, instead of directly modifying the computation of the CTC loss and its gradients. All experiments are conducted on the Chinese Mandarin dataset AISHELL-1. We verify the effectiveness of the proposed regularization on both streaming and non-streaming CTC models. The results show that the proposed method can reduce the average peak latency by about 100 to 200 milliseconds with almost no degradation in recognition accuracy.
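The core idea described above, a frame-wise knowledge-distillation term that pulls each frame's output distribution toward the next frame's, so that peaks drift left along the time axis, can be sketched as follows. This is a minimal illustration under assumed details: the function name, the KL direction, and the weight `alpha` are hypothetical and not taken from the paper, and a real implementation would operate on detached tensors inside the training graph.

```python
import math

def peak_first_regularizer(probs, alpha=0.5):
    """Hedged sketch of a peak-first regularization term.

    probs: list of T frames, each a list of V token probabilities
           (one CTC softmax output per frame).
    For each frame t, the distribution at frame t+1 acts as a
    fixed (stop-gradient) teacher; minimizing the KL divergence
    pulls frame t's prediction toward it, i.e. shifts peaks left.
    Returns the scalar regularization loss, weighted by alpha.
    """
    T = len(probs)
    loss = 0.0
    for t in range(T - 1):
        teacher = probs[t + 1]  # next frame as the distillation target
        student = probs[t]
        # KL(teacher || student), summed over the vocabulary
        loss += sum(p * math.log(p / q)
                    for p, q in zip(teacher, student) if p > 0)
    return alpha * loss / (T - 1)
```

In training, this term would simply be added to the standard CTC loss; because only the student frame receives gradient, the model is rewarded for emitting each peak one frame earlier while the CTC loss preserves recognition accuracy.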
