Paper Title

Multimodal Systems: Taxonomy, Methods, and Challenges

Authors

Baig, Muhammad Zeeshan; Kavakli, Manolya

Abstract

Humans naturally use multiple modalities to convey information. The human brain processes these modalities both sequentially and in parallel for communication, but this changes when humans interact with computers. Empowering computers to process input multimodally is a major area of investigation in Human-Computer Interaction (HCI). Advances in technology (powerful mobile devices, advanced sensors, new forms of output, etc.) have opened new avenues for researchers to design systems that allow multimodal interaction. It is only a matter of time before multimodal input overtakes traditional ways of interaction. This paper introduces the domain of multimodal systems, gives a brief history, describes the advantages of multimodal systems over unimodal systems, and discusses various modalities. Input modeling, fusion, and data collection are also discussed. Finally, the challenges in multimodal systems research are listed. The analysis of the literature shows that multimodal interface systems improve the task completion rate and reduce errors compared to unimodal systems. The most commonly used inputs for multimodal interaction are speech and gestures. For multimodal input, researchers prefer late integration of input modalities because it allows modalities and their corresponding vocabularies to be updated easily.
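
The preference for late integration can be illustrated with a minimal sketch. This is not the authors' method: it assumes decision-level fusion by a weighted sum of per-modality confidence scores, and every name in it (`late_fusion`, `speech_recognizer`, `gesture_recognizer`, the command labels, and the weights) is hypothetical. Each modality is recognized independently, and only the resulting decisions are combined.

```python
from typing import Callable, Dict

# A recognizer maps the raw input of one modality to per-command confidence scores.
Recognizer = Callable[[object], Dict[str, float]]


def late_fusion(
    inputs: Dict[str, object],
    recognizers: Dict[str, Recognizer],
    weights: Dict[str, float],
) -> str:
    """Combine per-modality decisions with a weighted sum; return the best command."""
    combined: Dict[str, float] = {}
    for modality, raw in inputs.items():
        # Each modality is decoded on its own, before any integration happens.
        scores = recognizers[modality](raw)
        for command, score in scores.items():
            combined[command] = combined.get(command, 0.0) + weights[modality] * score
    return max(combined, key=combined.get)


# Toy stand-ins for real recognizers (hypothetical, fixed outputs).
def speech_recognizer(audio: object) -> Dict[str, float]:
    return {"move": 0.6, "delete": 0.4}


def gesture_recognizer(frames: object) -> Dict[str, float]:
    return {"move": 0.3, "delete": 0.7}


recognizers = {"speech": speech_recognizer, "gesture": gesture_recognizer}
weights = {"speech": 0.5, "gesture": 0.5}
result = late_fusion({"speech": None, "gesture": None}, recognizers, weights)
```

Under these assumptions, adding or replacing a modality only means registering another entry in `recognizers` and `weights`; the other recognizers and their vocabularies stay untouched, which is the kind of easy update the abstract attributes to late integration.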
