Paper Title
Machine Learning approach for Credit Scoring
Paper Authors
Abstract
In this work we build a stack of machine learning models aimed at composing a state-of-the-art credit rating and default prediction system, obtaining excellent out-of-sample performance. Our approach is an excursion through the most recent ML / AI concepts. It starts with natural language processing (NLP) applied to the textual descriptions of economic sectors, using embeddings and autoencoders (AE). It then classifies defaultable firms on the basis of a wide range of economic features using gradient boosting machines (GBM), and calibrates their default probabilities, paying due attention to the treatment of unbalanced samples. Finally, we assign credit ratings through genetic algorithms (differential evolution, DE). Model interpretability is achieved by implementing recent techniques such as SHAP and LIME, which explain predictions locally in feature space.
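The last step of the pipeline, assigning ratings with differential evolution, can be illustrated with a minimal sketch. The abstract does not state the objective function, the number of rating classes, or the DE settings, so everything below is an assumption made for illustration: the default probabilities are synthetic, the function names (`assign_ratings`, `bucket_loss`, `differential_evolution`) are hypothetical, and the objective (squared error between each observed outcome and the default rate of its rating class) is only one plausible choice. The optimizer is a plain rand/1/bin differential evolution written with the standard library only.

```python
import random


def assign_ratings(pds, cuts):
    """Map each probability of default to a rating index (0 = safest)."""
    ordered = sorted(cuts)
    return [sum(p > c for c in ordered) for p in pds]


def bucket_loss(cuts, pds, defaults):
    """ASSUMED objective: mean squared error between each outcome and the
    observed default rate of the rating class it falls into."""
    ratings = assign_ratings(pds, cuts)
    loss = 0.0
    for k in range(len(cuts) + 1):
        outcomes = [d for d, r in zip(defaults, ratings) if r == k]
        if not outcomes:
            return float("inf")  # reject cut points that leave a class empty
        rate = sum(outcomes) / len(outcomes)
        loss += sum((d - rate) ** 2 for d in outcomes)
    return loss / len(pds)


def differential_evolution(objective, dim, lo, hi, rng,
                           pop_size=20, f=0.7, cr=0.9, generations=60):
    """Plain rand/1/bin differential evolution on the box [lo, hi]^dim."""
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    scores = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct population members, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one coordinate is mutated
            trial = []
            for j in range(dim):
                if j == forced or rng.random() < cr:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    v = min(hi, max(lo, v))  # clip back into the box
                else:
                    v = pop[i][j]
                trial.append(v)
            s = objective(trial)
            if s <= scores[i]:  # greedy selection
                pop[i], scores[i] = trial, s
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


# Synthetic stand-ins for calibrated default probabilities and outcomes.
rng = random.Random(0)
pds = [rng.betavariate(1, 4) for _ in range(400)]
defaults = [1 if rng.random() < p else 0 for p in pds]

# Three cut points give four rating classes (an arbitrary illustrative choice).
cuts, loss = differential_evolution(
    lambda x: bucket_loss(x, pds, defaults),
    dim=3, lo=min(pds), hi=max(pds), rng=rng,
)
ratings = assign_ratings(pds, cuts)
```

In a real system the probabilities would come from the calibrated classifier rather than a random generator, and the objective would encode whatever properties the rating scale is required to have, for example monotone default rates across classes or minimum class sizes.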