Paper title
Optimization of Decision Tree Evaluation Using SIMD Instructions
Authors
Abstract
Decision forests (ensembles of decision trees) are among the most popular machine learning algorithms. Applying large models to big data, for example scoring documents with learning-to-rank models, requires evaluating these models efficiently. In this paper, we explore MatrixNet, the ancestor of the popular CatBoost library. Both libraries use the SSE instruction set for scoring on the CPU. We investigate the opportunities offered by the AVX instruction set to evaluate models more efficiently. We achieved a 35% speedup on the binarization stage (comparison of node conditions) and a 20% speedup on the tree application stage on the ranking model.
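To make the two stages named in the abstract concrete, the following is a minimal scalar sketch in Python. It is not the paper's SSE or AVX code. It assumes oblivious trees (one split condition shared by all nodes at a given depth), which is the tree type CatBoost uses, and every name in it (`binarize`, `apply_oblivious_tree`, `apply_forest`, `borders`) is hypothetical and chosen for illustration only. A SIMD implementation would perform many of these comparisons and leaf lookups per instruction, whereas this version does them one at a time.

```python
def binarize(features, borders):
    """Binarization stage: compare each feature value with each of its borders.

    Returns one 0/1 flag per (feature, border) pair, flattened in order.
    """
    bits = []
    for value, feature_borders in zip(features, borders):
        for border in feature_borders:
            bits.append(1 if value > border else 0)
    return bits


def apply_oblivious_tree(bits, split_indices, leaf_values):
    """Tree application stage for one oblivious tree (an assumed layout).

    Each level tests a single precomputed flag, so the leaf index is
    simply the concatenation of the tested flags.
    """
    leaf = 0
    for level, split in enumerate(split_indices):
        leaf |= bits[split] << level
    return leaf_values[leaf]


def apply_forest(features, borders, trees):
    """Binarize once, then sum the selected leaf value of every tree."""
    bits = binarize(features, borders)
    return sum(apply_oblivious_tree(bits, s, v) for s, v in trees)


# Toy model: two features, three borders in total, two trees.
features = [0.5, 3.0]
borders = [[0.1, 0.7], [2.0]]
trees = [
    ([0, 2], [0.0, 1.0, 2.0, 3.0]),  # depth 2, tests flags 0 and 2
    ([1], [10.0, 20.0]),             # depth 1, tests flag 1
]
score = apply_forest(features, borders, trees)  # 13.0
```

Binarization runs once per document and its flags are shared by all trees, which is why the two stages can be measured and optimized separately.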