Paper Title

A New Gene Selection Algorithm using Fuzzy-Rough Set Theory for Tumor Classification

Authors

Farahbakhshian, Seyedeh Faezeh; Ahvanooey, Milad Taleby

Abstract

In statistics and machine learning, feature selection is the process of choosing a subset of relevant attributes for use in a predictive model. Recently, rough set-based feature selection techniques, which use feature dependency to guide the selection process, have attracted attention. In bioinformatics applications, tumor classification based on gene expression is used to determine the proper treatment and prognosis of the disease. Microarray gene expression data are high-dimensional, contain many superfluous genes, and come with few training instances. Since accurate supervised classification of gene expression instances in such high-dimensional problems is very complex, selecting appropriate genes is a crucial task for tumor classification. In this study, we present a new gene selection technique that uses a discernibility matrix of fuzzy-rough sets. The proposed technique takes into account the similarity of instances with the same class label as well as those with different class labels to improve the gene selection results, whereas previous state-of-the-art approaches only address the similarity of instances with different class labels. To this end, we extend the Johnson reducer technique to the fuzzy case. Experimental results demonstrate that this technique provides better efficiency than the state-of-the-art approaches.
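
To make the idea of a greedy Johnson-style reducer over a fuzzy discernibility matrix more concrete, here is a minimal Python sketch. It is not the authors' algorithm: the abstract gives no formulas, so the discernibility degree (a range-scaled absolute difference per gene), the stopping rule, and the names `fuzzy_discernibility`, `johnson_fuzzy_reduct`, and `tol` are all assumptions made for illustration. The sketch also covers only pairs of instances with different class labels, which is the baseline setting the abstract attributes to earlier approaches; the paper's extension to same-class pairs is not reproduced.

```python
from itertools import combinations


def fuzzy_discernibility(X, y):
    """Map each different-class pair (i, j) to a list of per-gene degrees in [0, 1].

    Assumption: the degree to which gene a discerns x_i from x_j is the
    absolute difference of their values, scaled by that gene's value range.
    """
    n_genes = len(X[0])
    ranges = []
    for a in range(n_genes):
        column = [row[a] for row in X]
        # A constant gene has range 0; use 1.0 so its degrees are all 0.
        ranges.append((max(column) - min(column)) or 1.0)

    matrix = {}
    for i, j in combinations(range(len(X)), 2):
        if y[i] != y[j]:
            matrix[(i, j)] = [
                abs(X[i][a] - X[j][a]) / ranges[a] for a in range(n_genes)
            ]
    return matrix


def johnson_fuzzy_reduct(X, y, tol=1e-9):
    """Greedy Johnson-style gene selection (hypothetical sketch).

    Repeatedly adds the gene that most increases the total discernibility
    of the pairs, until every pair is discerned as well as it would be
    with the full gene set.
    """
    matrix = fuzzy_discernibility(X, y)
    n_genes = len(X[0])
    # Best degree each pair can reach when all genes are available.
    target = {pair: max(degrees) for pair, degrees in matrix.items()}
    # Degree each pair has reached with the genes selected so far.
    covered = {pair: 0.0 for pair in matrix}
    selected = []

    while any(target[p] - covered[p] > tol for p in matrix):
        best_gene, best_gain = None, 0.0
        for a in range(n_genes):
            if a in selected:
                continue
            gain = sum(max(0.0, matrix[p][a] - covered[p]) for p in matrix)
            if gain > best_gain:
                best_gene, best_gain = a, gain
        if best_gene is None:  # no remaining gene improves any pair
            break
        selected.append(best_gene)
        for p in matrix:
            covered[p] = max(covered[p], matrix[p][best_gene])
    return selected


if __name__ == "__main__":
    # Toy data: 4 samples, 3 genes; gene 1 is constant and carries no information.
    X = [[0.0, 1.0, 5.0], [0.1, 1.0, 5.1], [1.0, 1.0, 5.0], [0.9, 1.0, 5.1]]
    y = [0, 0, 1, 1]
    print(johnson_fuzzy_reduct(X, y))
```

Taking the maximum degree over the selected genes is one common way to aggregate fuzzy discernibility, chosen here for simplicity. A different t-conorm or a threshold on the degrees would change which genes are picked.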
