Paper Title
Which is the best model for my data?
Paper Authors
Paper Abstract
In this paper, we tackle the problem of selecting the optimal model for a given structured pattern classification dataset. In this context, a model can be understood as a classifier together with a hyperparameter configuration. The proposed meta-learning approach relies purely on machine learning and involves four major steps. Firstly, we present a concise collection of 62 meta-features that address the information cancellation that arises when aggregating measure values involving both positive and negative measurements. Secondly, we describe two different approaches for synthetic data generation intended to enlarge the training data. Thirdly, we fit a set of pre-defined classification models to each classification problem while optimizing their hyperparameters using grid search. The goal is to create a meta-dataset such that each row denotes a multilabel instance describing a specific problem. The features of these meta-instances denote the statistical properties of the generated datasets, while the labels encode the grid search results as binary vectors such that the best-performing models are positively labeled. Finally, we tackle the model selection problem with several multilabel classifiers, including a Convolutional Neural Network designed to handle tabular data. The simulation results show that our meta-learning approach can correctly predict an optimal model for 91% of the synthetic datasets and 87% of the real-world datasets. Furthermore, we noticed that most meta-classifiers produced better results when using our meta-features. Overall, our proposal differs from other meta-learning approaches since it tackles the algorithm selection and hyperparameter tuning problems in a single step. Toward the end, we perform a feature importance analysis to determine which statistical features drive the model selection mechanism.
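To make the meta-dataset construction described in the abstract concrete, below is a minimal Python sketch of building one meta-instance: a few stand-in meta-features (not the paper's 62) plus a binary label vector derived from grid search. The candidate classifiers and grids, the 1% tolerance rule for "best-performing", and the specific statistics are illustrative assumptions, not the authors' exact setup; the absolute-value correlation summary merely hints at how aggregation can avoid sign cancellation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Pre-defined classifiers and hyperparameter grids; every grid cell
# (classifier + configuration) counts as one candidate "model".
CANDIDATES = [
    (KNeighborsClassifier(), {"n_neighbors": [1, 3, 5]}),
    (DecisionTreeClassifier(random_state=0), {"max_depth": [3, 5, None]}),
    (SVC(random_state=0), {"C": [0.1, 1.0, 10.0]}),
]

def meta_features(X, y):
    """Stand-in for the paper's 62 meta-features. The correlation summary
    aggregates absolute values so that positive and negative correlations
    do not cancel out when averaged (the information-cancellation issue)."""
    corr = np.corrcoef(X, rowvar=False)
    off_diag = corr[~np.eye(corr.shape[0], dtype=bool)]
    return np.array([
        X.shape[0],                 # number of instances
        X.shape[1],                 # number of features
        len(np.unique(y)),          # number of classes
        np.mean(np.abs(off_diag)),  # mean absolute feature correlation
    ])

def meta_instance(X, y, tol=0.01):
    """Build one row of the meta-dataset: (features, binary label vector).
    labels[i] = 1 iff model i's mean CV accuracy is within `tol` of the
    best score across all grid cells."""
    names, scores = [], []
    for estimator, grid in CANDIDATES:
        search = GridSearchCV(estimator, grid, cv=3).fit(X, y)
        for params, score in zip(search.cv_results_["params"],
                                 search.cv_results_["mean_test_score"]):
            names.append((type(estimator).__name__, params))
            scores.append(score)
    scores = np.asarray(scores)
    labels = (scores >= scores.max() - tol).astype(int)
    return meta_features(X, y), labels, names

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
features, labels, names = meta_instance(X, y)
print("meta-features:", np.round(features, 3))
print("positively labeled models:", [n for n, l in zip(names, labels) if l])
```

Stacking such rows over many (real and synthetic) datasets yields the meta-dataset on which the multilabel meta-classifiers are then trained.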