Paper Title

A two-step approach to leverage contextual data: speech recognition in air-traffic communications

Authors

Iuliia Nigmatulina, Juan Zuluaga-Gomez, Amrutha Prasad, Seyyed Saeed Sarfjoo, Petr Motlicek

Abstract

Automatic Speech Recognition (ASR), as an aid to speech communication between pilots and air-traffic controllers, can significantly reduce the complexity of the task and increase the reliability of transmitted information. Applying ASR can lower the number of incidents caused by misunderstanding and improve air traffic management (ATM) efficiency. High-accuracy predictions, especially of key information such as callsigns and commands, are required to minimize the risk of errors. We show that combining the benefits of ASR and Natural Language Processing (NLP) methods to make use of surveillance data (i.e., an additional modality) considerably improves the recognition of callsigns (named entities). In this paper, we investigate a two-step callsign boosting approach: (1) at the first step (ASR), weights of probable callsign n-grams are reduced in G.fst and/or in the decoding FST (lattices); (2) at the second step (NLP), callsigns extracted from the improved recognition outputs with Named Entity Recognition (NER) are correlated with the surveillance data to select the most suitable one. Boosting callsign n-grams with the combination of ASR and NLP methods eventually leads to up to a 53.7% absolute, or 60.4% relative, improvement in callsign recognition.
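
The second (NLP) step can be pictured with a small sketch. The abstract does not say how extracted callsigns are correlated with surveillance data, so the following is only one plausible reading under stated assumptions: each surveillance callsign (ICAO format, e.g. DLH32A) is expanded into its spoken word sequence, and the candidate with the smallest word-level edit distance to the NER-extracted callsign is chosen. The names `AIRLINE_WORDS`, `expand_callsign`, `edit_distance`, and `select_callsign` are hypothetical, and the airline table is a tiny illustrative subset.

```python
import string
from typing import Dict, List, Optional

# Illustrative subset of ICAO airline designators and their spoken forms (assumption:
# a real system would need a complete table).
AIRLINE_WORDS: Dict[str, str] = {"DLH": "lufthansa", "SWR": "swiss", "RYR": "ryanair"}

NATO = ("alfa bravo charlie delta echo foxtrot golf hotel india juliett kilo lima "
        "mike november oscar papa quebec romeo sierra tango uniform victor whiskey "
        "xray yankee zulu").split()
LETTER_WORDS: Dict[str, str] = dict(zip(string.ascii_uppercase, NATO))
DIGIT_WORDS: Dict[str, str] = dict(
    zip(string.digits, "zero one two three four five six seven eight nine".split())
)


def expand_callsign(icao: str) -> List[str]:
    """Expand an ICAO-format callsign (e.g. 'DLH32A') into spoken words."""
    icao = icao.upper()
    words: List[str] = []
    rest = icao
    if icao[:3] in AIRLINE_WORDS:
        words.append(AIRLINE_WORDS[icao[:3]])
        rest = icao[3:]
    for ch in rest:
        # Digits and letters are spelled out one by one.
        words.append(DIGIT_WORDS[ch] if ch in DIGIT_WORDS else LETTER_WORDS[ch])
    return words


def edit_distance(a: List[str], b: List[str]) -> int:
    """Word-level Levenshtein distance between two word sequences."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, 1):
        cur = [i]
        for j, wb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (wa != wb)))   # substitution or match
        prev = cur
    return prev[-1]


def select_callsign(extracted: str, surveillance: List[str]) -> Optional[str]:
    """Pick the surveillance callsign closest to the NER-extracted word sequence."""
    if not surveillance:
        return None
    hyp = extracted.lower().split()
    return min(surveillance, key=lambda cs: edit_distance(hyp, expand_callsign(cs)))


if __name__ == "__main__":
    nearby = ["SWR32A", "DLH32A", "RYR1AB"]
    # "tree" stands in for a recognition error in the ASR output.
    print(select_callsign("lufthansa tree two alfa", nearby))
```

Edit distance is used here only because it tolerates a misrecognized word in the extracted callsign; it is a stand-in for whatever correlation measure the authors actually use.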