Paper Title
An Online Sparse Streaming Feature Selection Algorithm
Paper Authors
Abstract
Online streaming feature selection (OSFS), which conducts feature selection in an online manner, plays an important role in dealing with high-dimensional data. In many real applications, such as intelligent healthcare platforms, streaming features always have some missing data. This raises a crucial challenge in conducting OSFS: how to establish the uncertain relationship between sparse streaming features and labels. Unfortunately, existing OSFS algorithms do not consider this uncertain relationship. To fill this gap, this paper proposes an online sparse streaming feature selection with uncertainty (OS2FSU) algorithm. OS2FSU consists of two main parts: 1) latent factor analysis is used to pre-estimate the missing data in sparse streaming features before feature selection is conducted, and 2) fuzzy logic and neighborhood rough sets are employed to alleviate the uncertainty between the estimated streaming features and the labels during feature selection. In the experiments, OS2FSU is compared with five state-of-the-art OSFS algorithms on six real datasets. The results demonstrate that OS2FSU outperforms its competitors when missing data are encountered in OSFS.
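To make the first component more concrete, the following is a minimal sketch of how latent factor analysis can pre-estimate missing entries in a sparse feature matrix. It is not the OS2FSU implementation: the abstract does not specify the model, so the function name `lfa_fill`, the plain SGD update, and all hyperparameters (`rank`, `lr`, `reg`, `epochs`) are assumptions chosen for illustration. Missing values are assumed to be encoded as NaN.

```python
import numpy as np

def lfa_fill(X, rank=2, lr=0.01, reg=0.01, epochs=500, seed=0):
    """Fill NaN entries of X with a low-rank latent factor estimate.

    Hypothetical helper: a generic regularized matrix factorization
    trained by SGD on the observed entries only.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n, m = X.shape
    P = rng.normal(scale=0.1, size=(n, rank))  # latent factors of instances
    Q = rng.normal(scale=0.1, size=(m, rank))  # latent factors of features
    rows, cols = np.nonzero(~np.isnan(X))      # indices of observed entries

    for _ in range(epochs):
        for i, j in zip(rows, cols):
            err = X[i, j] - P[i] @ Q[j]        # error on one observed entry
            p_old = P[i].copy()                # keep old value for Q's update
            P[i] += lr * (err * Q[j] - reg * P[i])
            Q[j] += lr * (err * p_old - reg * Q[j])

    estimate = P @ Q.T
    # Observed entries are kept as-is; only missing ones are replaced.
    return np.where(np.isnan(X), estimate, X)

if __name__ == "__main__":
    X = np.array([[1.0, 2.0, np.nan],
                  [2.0, np.nan, 6.0],
                  [3.0, 6.0, 9.0]])
    print(lfa_fill(X, rank=1))
```

In a streaming setting, a completed feature produced this way would then be passed to the selection step, where the fuzzy logic and neighborhood rough set component handles the residual uncertainty of the estimates.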