Paper Title

Revisiting End-to-End Speech-to-Text Translation From Scratch

Paper Authors

Biao Zhang, Barry Haddow, Rico Sennrich

Paper Abstract

End-to-end (E2E) speech-to-text translation (ST) often depends on pretraining its encoder and/or decoder using source transcripts via speech recognition or text translation tasks, without which translation performance drops substantially. However, transcripts are not always available, and how significant such pretraining is for E2E ST has rarely been studied in the literature. In this paper, we revisit this question and explore the extent to which the quality of E2E ST trained on speech-translation pairs alone can be improved. We reexamine several techniques previously shown to benefit ST and offer a set of best practices that bias a Transformer-based E2E ST system toward training from scratch. In addition, we propose a parameterized distance penalty to facilitate the modeling of locality in self-attention for speech. On four benchmarks covering 23 languages, our experiments show that, without using any transcripts or pretraining, the proposed system matches and even outperforms previous studies that adopt pretraining, although a gap remains in (extremely) low-resource settings. Finally, we discuss neural acoustic feature modeling, where a neural model is designed to extract acoustic features directly from raw speech signals, with the goal of simplifying inductive biases and adding freedom to the model in describing speech. For the first time, we demonstrate its feasibility and show encouraging results on ST tasks.
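
The abstract does not spell out the exact form of the parameterized distance penalty. Below is a minimal sketch of one plausible variant, assuming the penalty is a learnable per-head bias table indexed by the clipped token distance |i - j| and added to the attention logits before the softmax; the class name, the clipping threshold max_dist, and all dimensions are illustrative assumptions, not the paper's formulation.

```python
# Sketch of a learnable distance-based attention bias for speech encoders.
# Assumption: one learnable penalty per head and per clipped distance,
# added to the attention logits to encourage locality.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DistancePenaltySelfAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int, max_dist: int = 128):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.max_dist = max_dist
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # Zero-initialized so training starts from plain self-attention.
        self.penalty = nn.Parameter(torch.zeros(n_heads, max_dist + 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        b, t, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        k = k.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        v = v.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        logits = q @ k.transpose(-2, -1) / self.d_head ** 0.5  # (b, h, t, t)
        # Distance matrix |i - j|, clipped, indexing the learnable table.
        dist = (torch.arange(t)[:, None] - torch.arange(t)[None, :]).abs()
        dist = dist.clamp(max=self.max_dist).to(x.device)
        bias = self.penalty[:, dist]  # (h, t, t)
        attn = F.softmax(logits + bias.unsqueeze(0), dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, t, -1)
        return self.out(out)
```

This drop-in replacement for a standard self-attention layer keeps the rest of the Transformer unchanged, which matches the abstract's framing of the penalty as a modification to position/locality modeling rather than to the overall architecture.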
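For neural acoustic feature modeling, the abstract only states that a neural model extracts features directly from the raw speech signal. The sketch below assumes the hand-crafted filterbank front-end is replaced by a stack of strided 1-D convolutions over the waveform; the layer widths, kernel sizes, and strides are illustrative choices, not values from the paper.

```python
# Sketch of a learned front-end that maps raw audio to frame-level
# features, replacing hand-crafted log-Mel filterbanks. All hyperparameters
# here are illustrative assumptions.
import torch
import torch.nn as nn

class RawSpeechFrontend(nn.Module):
    def __init__(self, d_model: int = 512):
        super().__init__()
        # Strided convolutions downsample the raw waveform into a
        # sequence of frame-level feature vectors for the ST encoder.
        self.conv = nn.Sequential(
            nn.Conv1d(1, 64, kernel_size=10, stride=5), nn.GELU(),
            nn.Conv1d(64, 128, kernel_size=8, stride=4), nn.GELU(),
            nn.Conv1d(128, d_model, kernel_size=4, stride=4), nn.GELU(),
        )

    def forward(self, wav: torch.Tensor) -> torch.Tensor:
        # wav: (batch, samples) raw audio -> (batch, frames, d_model)
        feats = self.conv(wav.unsqueeze(1))
        return feats.transpose(1, 2)
```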
