Paper Title

A GA-like Dynamic Probability Method With Mutual Information for Feature Selection

Authors

Wang, Gaoshuai; Lauri, Fabrice; Hassani, Amir Hajjam El

Abstract

Feature selection plays a vital role in improving classifier performance. However, current methods do not effectively identify the complex interactions among the selected features. To further remove these hidden negative interactions, we propose a GA-like dynamic probability (GADP) method with mutual information, which has a two-layer structure. The first layer applies the mutual information method to obtain a primary feature subset. The second layer, the GA-like dynamic probability algorithm, mines more supportive features from these candidate features. Essentially, the GA-like method is a population-based algorithm, so its working mechanism is similar to that of the GA. Unlike popular works, which frequently focus on improving the GA's operators to enhance search ability and reduce convergence time, we boldly abandon the GA's operators and employ dynamic probabilities that rely on the performance of each chromosome to determine feature selection in the new generation. The dynamic probability mechanism significantly reduces the number of parameters in the GA, which makes it easy to use. As each gene's probability is independent, chromosome diversity in GADP is greater than in the traditional GA, which gives GADP a wider search space and lets it select relevant features more effectively and accurately. To verify our method's superiority, we evaluate it under multiple conditions on 15 datasets. The results demonstrate that the proposed method performs better; in general, it achieves the best accuracy. Further, we compare the proposed model with popular heuristic methods such as PSO, FPA, and WOA. Our model still holds advantages over them.
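The two-layer idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual probability update rule, so the second layer below substitutes a simple PBIL-style rule (each gene's probability moves toward its frequency among the better half of the population); this is an assumption, not the authors' formula. The function names, the default parameters, and the user-supplied `fitness` callback (for example, classifier accuracy on a feature subset) are all hypothetical.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def mi_filter(columns, labels, k):
    """Layer 1: indices of the k feature columns with the highest MI with the labels."""
    ranked = sorted(range(len(columns)),
                    key=lambda j: mutual_information(columns[j], labels),
                    reverse=True)
    return ranked[:k]


def dynamic_probability_search(candidates, fitness, pop_size=20,
                               generations=30, rate=0.2, seed=0):
    """Layer 2 with an ASSUMED PBIL-style update rule (not taken from the paper)."""
    rng = random.Random(seed)
    probs = [0.5] * len(candidates)  # one independent probability per gene
    best_subset, best_score = [], -math.inf
    for _ in range(generations):
        # Sample chromosomes gene by gene; no crossover or mutation operators.
        population = [[rng.random() < p for p in probs] for _ in range(pop_size)]
        scored = []
        for mask in population:
            subset = [c for c, keep in zip(candidates, mask) if keep]
            scored.append((fitness(subset), mask))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        if scored[0][0] > best_score:
            best_score = scored[0][0]
            best_subset = [c for c, keep in zip(candidates, scored[0][1]) if keep]
        # Move each gene's probability toward its frequency in the better half,
        # clamped so that no gene becomes permanently fixed on or off.
        elite = [mask for _, mask in scored[:max(1, pop_size // 2)]]
        for i in range(len(probs)):
            freq = sum(mask[i] for mask in elite) / len(elite)
            probs[i] = min(0.95, max(0.05, (1 - rate) * probs[i] + rate * freq))
    return best_subset, best_score
```

In this sketch, `mi_filter` would supply the `candidates` list, and `fitness` would typically wrap a cross-validated classifier score. Only the population size, generation count, and update rate need to be set, which mirrors the abstract's point about having fewer parameters than a GA with crossover and mutation operators.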
