Paper Title

pGMM Kernel Regression and Comparisons with Boosted Trees

Paper Authors

Ping Li, Weijie Zhao

Abstract

In this work, we demonstrate the advantage of the pGMM (``powered generalized min-max'') kernel in the context of (ridge) regression. In recent prior studies, the pGMM kernel has been extensively evaluated for classification tasks, with logistic regression, support vector machines, as well as deep neural networks. In this paper, we provide an experimental study on ridge regression, comparing pGMM kernel regression with ordinary ridge linear regression as well as RBF kernel ridge regression. Perhaps surprisingly, even without tuning (i.e., fixing the power parameter of the pGMM kernel at $p=1$), the pGMM kernel already performs well. Furthermore, by tuning the parameter $p$, this (deceptively simple) pGMM kernel performs quite comparably to boosted trees. Boosting and boosted trees are very popular in machine learning practice. For regression tasks, practitioners typically use $L_2$ boosting, i.e., minimizing the $L_2$ loss. Sometimes, for the purpose of robustness, $L_1$ boosting might be a choice. In this study, we implement $L_p$ boosting for $p\geq 1$ and include it in the ``Fast ABC-Boost'' package. Perhaps also surprisingly, the best performance (in terms of $L_2$ regression loss) is often attained at $p>2$, in some cases at $p\gg 2$. This phenomenon was already demonstrated by Li et al. (UAI 2010) in the context of $k$-nearest neighbor classification using $L_p$ distances. In summary, the implementation of $L_p$ boosting provides practitioners with additional flexibility in tuning boosting algorithms, potentially achieving better accuracy in regression applications.
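
To make the abstract concrete, below is a minimal Python/NumPy sketch of pGMM kernel ridge regression, assuming the pGMM definition from the authors' prior work on tunable GMM kernels: data are first mapped to nonnegative vectors by splitting each coordinate into positive and negative parts, and the kernel is the ratio of summed powered mins to summed powered maxes. The regularization strength `lam`, the power `p`, and the random data here are hypothetical placeholders, not values from the paper.

```python
import numpy as np

def nonneg_transform(X):
    # Split each coordinate into its positive and negative parts,
    # so the transformed data is nonnegative: x -> (max(x,0), max(-x,0)).
    return np.concatenate([np.maximum(X, 0), np.maximum(-X, 0)], axis=1)

def pgmm_kernel(X, Y, p=1.0):
    # pGMM(u, v; p) = sum_i min(u_i, v_i)^p / sum_i max(u_i, v_i)^p,
    # computed on the nonnegative transformed data.
    U, V = nonneg_transform(X), nonneg_transform(Y)
    K = np.empty((U.shape[0], V.shape[0]))
    for i, u in enumerate(U):
        mins = np.minimum(u, V) ** p   # broadcasts u against all rows of V
        maxs = np.maximum(u, V) ** p
        K[i] = mins.sum(axis=1) / maxs.sum(axis=1)
    return K

def kernel_ridge_fit(K, y, lam=1.0):
    # Standard kernel ridge regression: alpha = (K + lam * I)^{-1} y.
    return np.linalg.solve(K + lam * np.eye(K.shape[0]), y)

# Usage with hypothetical random data:
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(100, 5)), rng.normal(size=100)
X_test = rng.normal(size=(10, 5))
alpha = kernel_ridge_fit(pgmm_kernel(X_train, X_train, p=1.0), y_train, lam=1.0)
y_pred = pgmm_kernel(X_test, X_train, p=1.0) @ alpha
```

Since the transformed data are nonnegative and $t \mapsto t^p$ is monotone for $p>0$, powering before or after the min/max is equivalent; forming the full kernel matrix costs $O(n^2 d)$, which is the usual price of exact kernel ridge regression.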
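The $L_p$ boosting idea can likewise be sketched as ordinary gradient boosting on the negative gradient of the $L_p$ loss $\frac{1}{p}\sum_i |y_i - F(x_i)|^p$, namely the pseudo-residuals $|y_i - F_i|^{p-1}\,\mathrm{sign}(y_i - F_i)$: at $p=2$ these reduce to $y - F$ (standard $L_2$ boost) and at $p=1$ to $\mathrm{sign}(y - F)$ ($L_1$ boost). This is only a first-order sketch using scikit-learn trees as stand-ins; the actual ``Fast ABC-Boost'' package has its own tree implementation, and the learning rate, depth, and tree count below are hypothetical defaults.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def lp_pseudo_residuals(y, F, p=2.0):
    # Negative gradient of the L_p loss (1/p) * |y - F|^p with respect
    # to the current prediction F: |y - F|^(p-1) * sign(y - F).
    r = y - F
    return np.sign(r) * np.abs(r) ** (p - 1)

def lp_boost(X, y, p=2.0, n_trees=100, lr=0.1, max_depth=3):
    # Plain gradient boosting: each tree is fit to the L_p pseudo-residuals,
    # starting from the constant prediction y.mean().
    F = np.full(len(y), y.mean())
    trees = []
    for _ in range(n_trees):
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, lp_pseudo_residuals(y, F, p))
        F += lr * tree.predict(X)
        trees.append(tree)
    return y.mean(), trees

def lp_predict(model, X, lr=0.1):
    base, trees = model
    return base + lr * sum(t.predict(X) for t in trees)
```

Tuning $p$ over a grid (e.g., $p \in \{1, 2, 4, 8, \ldots\}$) and selecting by validation $L_2$ loss mirrors the flexibility described in the abstract, where the best test $L_2$ loss is often attained at $p>2$.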
