Paper Title

Lifting Interpretability-Performance Trade-off via Automated Feature Engineering

Paper Authors

Alicja Gosiewska, Przemyslaw Biecek

Paper Abstract

Complex black-box predictive models may have high performance, but lack of interpretability causes problems such as lack of trust, lack of stability, and sensitivity to concept drift. On the other hand, achieving satisfactory accuracy with interpretable models requires more time-consuming work related to feature engineering. Can we train interpretable and accurate models without time-consuming feature engineering? We propose a method that uses elastic black-boxes as surrogate models to create simpler, less opaque, yet still accurate and interpretable glass-box models. New models are created on newly engineered features extracted with the help of a surrogate model. We support the analysis with a large-scale benchmark on several tabular data sets from the OpenML database. There are two results: 1) extracting information from complex models may improve the performance of linear models; 2) the results question a common myth that complex machine learning models outperform linear models.
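The abstract outlines a surrogate-assisted pipeline: fit a flexible black-box model, use it to derive simple engineered features, then train an interpretable glass-box model on those features. Below is a minimal Python sketch of one way such a pipeline could look; it is not the authors' implementation. The dataset, the `GradientBoostingClassifier` surrogate, the partial-dependence-based split heuristic, and the helper names `pd_breakpoints` / `transform` are all illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of the surrogate-assisted idea from the abstract, NOT the
# authors' implementation. Illustrative assumptions: a gradient-boosting
# surrogate, partial-dependence curves as the source of split points, and a
# "largest jump" heuristic for choosing those splits.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import partial_dependence
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 1) Fit an elastic black-box surrogate on the raw features.
surrogate = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

def pd_breakpoints(model, X, feature, n_breaks=2):
    """Split points for one feature: where the surrogate's partial-dependence
    curve jumps the most (a crude stand-in for proper changepoint detection)."""
    pdp = partial_dependence(model, X, [feature], grid_resolution=30)
    grid = pdp["grid_values"][0] if "grid_values" in pdp else pdp["values"][0]
    curve = pdp["average"][0]
    idx = np.argsort(np.abs(np.diff(curve)))[-n_breaks:]
    return np.sort(grid[idx])

# 2) Engineer new features: replace each numeric column with one-hot interval
#    indicators defined by the surrogate-derived split points.
breakpoints = [pd_breakpoints(surrogate, X_train, j) for j in range(X_train.shape[1])]

def transform(X, breakpoints):
    cols = []
    for j, b in enumerate(breakpoints):
        interval = np.digitize(X[:, j], b)          # interval id per sample
        cols.append(np.eye(len(b) + 1)[interval])   # one-hot encode intervals
    return np.hstack(cols)

X_train_new = transform(X_train, breakpoints)
X_test_new = transform(X_test, breakpoints)

# 3) Fit an interpretable glass-box model on the engineered features and
#    compare it with a linear model trained on the raw features.
plain = LogisticRegression(max_iter=5000).fit(X_train, y_train)
glass_box = LogisticRegression(max_iter=5000).fit(X_train_new, y_train)
print("linear model, raw features:      ", plain.score(X_test, y_test))
print("linear model, surrogate features:", glass_box.score(X_test_new, y_test))
```

The point of the sketch is the structure rather than the specific heuristic: the black-box is used only to decide how to transform the inputs, while the final model remains a linear model over a small set of interval indicators, so its coefficients stay directly interpretable.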
