Title
SurfMyoAiR: A surface Electromyography based framework for Airwriting Recognition
Authors
Abstract
Airwriting recognition is the task of identifying letters written in free space with finger movements. Electromyography (EMG) is a technique for recording the electrical activity of muscles as they contract and relax during movement, and it is widely used for gesture recognition. Most current research in gesture recognition focuses on identifying static gestures. However, dynamic gestures are natural and user-friendly, which makes them well suited as alternate input methods in Human-Computer Interaction applications. Airwriting recognition using EMG signals recorded from forearm muscles is therefore a viable solution. Since the user does not need to learn any new gestures, and a large range of words can be formed by concatenating letters, the approach generalizes to a wider population. There has been limited work on recognizing airwriting from EMG signals, and this gap forms the core idea of the current work. The SurfMyoAiR dataset, comprising EMG signals recorded while writing English uppercase letters, was constructed. Several time-domain features for constructing the EMG envelope, and two time-frequency image representations, the Short-Time Fourier Transform and the Continuous Wavelet Transform, were explored as inputs to a deep learning model for airwriting recognition. Several deep learning architectures were used for this task. Additionally, the effect of parameters such as signal length, window length, and interpolation technique on recognition performance was comprehensively explored. The best accuracies achieved were 78.50% and 62.19% in the user-dependent and user-independent scenarios, respectively, using the Short-Time Fourier Transform in conjunction with a 2D Convolutional Neural Network based classifier. Airwriting has great potential as a user-friendly modality to be used as an alternate input method in Human-Computer Interaction applications.
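The two kinds of input named in the abstract, an EMG envelope built from a time-domain feature and a Short-Time Fourier Transform image, can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline: the synthetic signal, the 1000 Hz sampling rate, the choice of moving RMS as the envelope feature, the Hann window, and the window and hop lengths are all assumptions made for illustration, and the helper names `rms_envelope` and `stft_magnitude` are hypothetical.

```python
import numpy as np


def rms_envelope(signal, window):
    """Moving root-mean-square envelope, same length as the input."""
    kernel = np.ones(window) / window
    return np.sqrt(np.convolve(signal ** 2, kernel, mode="same"))


def stft_magnitude(signal, window, hop):
    """Magnitude STFT with a Hann window; rows are frequency bins, columns are frames."""
    taper = np.hanning(window)
    starts = range(0, len(signal) - window + 1, hop)
    frames = np.stack([signal[s:s + window] * taper for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1)).T


# Synthetic stand-in for one EMG channel (not data from the SurfMyoAiR dataset):
# noise whose amplitude rises and falls, mimicking a burst of muscle activity.
rng = np.random.default_rng(0)
fs = 1000            # assumed sampling rate in Hz
n = 2 * fs           # two seconds of samples
burst = np.exp(-0.5 * ((np.arange(n) - n / 2) / (0.2 * fs)) ** 2)
emg = burst * rng.standard_normal(n)

envelope = rms_envelope(emg, window=100)                # 1D time-domain representation
spectrogram = stft_magnitude(emg, window=128, hop=64)   # 2D time-frequency image
print(envelope.shape, spectrogram.shape)
```

The envelope is a one-dimensional sequence, while the spectrogram is a two-dimensional array that could be treated as an image by a 2D convolutional classifier. The window length fixes the trade-off between time and frequency resolution, which is one reason a study of this kind would vary it.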