Code-Switching without Switching: Language Agnostic End-to-End Speech Translation

Paper Title

Code-Switching without Switching: Language Agnostic End-to-End Speech Translation

Authors

Christian Huber, Enes Yavuz Ugan, Alexander Waibel

Abstract

We propose a) a Language Agnostic end-to-end Speech Translation model (LAST), and b) a data augmentation strategy to increase code-switching (CS) performance. With increasing globalization, multiple languages are increasingly used interchangeably during fluent speech. Such CS complicates traditional speech recognition and translation, as we must recognize which language was spoken first and then apply a language-dependent recognizer and subsequent translation component to generate the desired target language output. Such a pipeline introduces latency and errors. In this paper, we eliminate the need for that, by treating speech recognition and translation as one unified end-to-end speech translation problem. By training LAST with both input languages, we decode speech into one target language, regardless of the input language. LAST delivers comparable recognition and speech translation accuracy in monolingual usage, while reducing latency and error rate considerably when CS is observed.
