Paper Title

Sign Language Transformers: Joint End-to-end Sign Language Recognition and Translation

Authors

Necati Cihan Camgoz, Oscar Koller, Simon Hadfield, Richard Bowden

Abstract

Prior work on Sign Language Translation has shown that having a mid-level sign gloss representation (effectively recognizing the individual signs) improves translation performance drastically. In fact, the current state of the art in translation requires gloss-level tokenization in order to work. We introduce a novel transformer-based architecture that jointly learns Continuous Sign Language Recognition and Translation while being trainable in an end-to-end manner. This is achieved by using a Connectionist Temporal Classification (CTC) loss to bind the recognition and translation problems into a single unified architecture. This joint approach does not require any ground-truth timing information, simultaneously solves two co-dependent sequence-to-sequence learning problems, and leads to significant performance gains. We evaluate the recognition and translation performance of our approaches on the challenging RWTH-PHOENIX-Weather-2014T (PHOENIX14T) dataset. We report state-of-the-art sign language recognition and translation results achieved by our Sign Language Transformers. Our translation networks outperform both sign video to spoken language and gloss to spoken language translation models, in some cases more than doubling the performance (9.58 vs. 21.80 BLEU-4 score). We also share new baseline translation results using transformer networks for several other text-to-text sign language translation tasks.
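The abstract's claim that CTC removes the need for ground-truth timing can be illustrated with a small sketch. CTC scores frame-level label sequences that reduce to the target gloss sequence once repeated labels are merged and blanks are dropped, so no per-frame annotation is needed. The code below is a minimal, illustrative sketch and not the authors' implementation: the names `ctc_collapse` and `joint_loss`, the `<blank>` symbol, the example glosses, and the loss weights `w_rec` and `w_trans` are all assumptions introduced here.

```python
from itertools import groupby

BLANK = "<blank>"  # hypothetical name for the CTC blank symbol


def ctc_collapse(frame_labels, blank=BLANK):
    """Map a per-frame label sequence to a gloss sequence.

    Standard CTC collapsing: merge consecutive repeats first,
    then remove blanks. A blank between two identical labels
    keeps them as two separate glosses.
    """
    merged = [label for label, _ in groupby(frame_labels)]
    return [label for label in merged if label != blank]


def joint_loss(recognition_loss, translation_loss, w_rec=1.0, w_trans=1.0):
    """Weighted sum of the two task losses (weights are assumed values).

    recognition_loss would be a CTC loss over glosses and
    translation_loss a loss over spoken-language words; here both
    are plain floats so the sketch stays dependency-free.
    """
    return w_rec * recognition_loss + w_trans * translation_loss


# Nine video frames, two glosses, no frame-level alignment supplied.
frames = [BLANK, "RAIN", "RAIN", BLANK, BLANK,
          "TOMORROW", "TOMORROW", "TOMORROW", BLANK]
glosses = ctc_collapse(frames)
print(glosses)
```

In a real training setup the two scalar losses would come from a deep learning framework, and the gradient of their weighted sum would update the shared encoder, which is what ties recognition and translation into one architecture.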
