Paper Title


Multidomain Multimodal Fusion For Human Action Recognition Using Inertial Sensors

Paper Authors

Zeeshan Ahmad, Naimul Khan

Abstract


One of the major reasons for misclassification of multiplex actions during action recognition is the unavailability of complementary features that provide semantic information about the actions. In different domains, these features are present with different scales and intensities. In the existing literature, features are extracted independently in different domains, but the benefits of fusing these multidomain features are not realized. To address this challenge and to extract a complete set of complementary information, in this paper we propose a novel multidomain multimodal fusion framework that extracts complementary and distinct features from different domains of the input modality. We transform the input inertial data into signal images, and then make the input modality multidomain and multimodal by transforming the spatial-domain information into the frequency and time-spectrum domains using the Discrete Fourier Transform (DFT) and the Gabor wavelet transform (GWT), respectively. Features in the different domains are extracted by Convolutional Neural Networks (CNNs) and then fused by Canonical Correlation based Fusion (CCF) to improve the accuracy of human action recognition. Experimental results on three inertial datasets show the superiority of the proposed method when compared to the state-of-the-art.
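The pipeline the abstract describes (inertial window → signal image → frequency-domain and time-spectrum-domain representations) can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the row-stacking scheme in `to_signal_image`, the `rows` parameter, and the Gabor kernel settings are all assumptions chosen for clarity; the CNN feature extraction and CCF fusion stages are omitted.

```python
import numpy as np

def to_signal_image(inertial, rows=24):
    """Stack repeated channel rows of a (T, C) inertial window into a
    2-D 'signal image' (a simplified stand-in for the paper's scheme)."""
    T, C = inertial.shape
    reps = int(np.ceil(rows / C))
    return np.tile(inertial.T, (reps, 1))[:rows]  # shape (rows, T)

def dft_magnitude(img):
    """Frequency-domain representation: centered 2-D DFT magnitude."""
    return np.abs(np.fft.fftshift(np.fft.fft2(img)))

def gabor_kernel(ksize=11, sigma=2.0, theta=0.0, lam=4.0):
    """Real part of a 2-D Gabor wavelet at orientation theta
    (parameters are illustrative assumptions, not the paper's)."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr**2 + yr**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lam)

def gabor_response(img, **kw):
    """Time-spectrum-domain representation via FFT-based convolution
    with a Gabor kernel, cropped back to the input size."""
    k = gabor_kernel(**kw)
    H, W = img.shape
    kh, kw_ = k.shape
    F = np.fft.fft2(img, s=(H + kh - 1, W + kw_ - 1))
    K = np.fft.fft2(k, s=(H + kh - 1, W + kw_ - 1))
    full = np.real(np.fft.ifft2(F * K))
    return full[kh // 2:kh // 2 + H, kw_ // 2:kw_ // 2 + W]

# Example: a 52-sample window of 6 inertial channels (3-axis accel + gyro)
window = np.random.randn(52, 6)
img = to_signal_image(window)          # spatial domain, (24, 52)
freq = dft_magnitude(img)              # frequency domain, (24, 52)
spec = gabor_response(img)             # time-spectrum domain, (24, 52)
```

Each of the three arrays would then be fed to its own CNN branch, with the resulting feature vectors combined by canonical-correlation-based fusion.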
