Paper Title

GSR: A Generalized Symbolic Regression Approach

Authors

Tony Tohme, Dehong Liu, Kamal Youcef-Toumi

Abstract

Identifying the mathematical relationships that best describe a dataset remains a very challenging problem in machine learning, and is known as Symbolic Regression (SR). In contrast to neural networks which are often treated as black boxes, SR attempts to gain insight into the underlying relationships between the independent variables and the target variable of a given dataset by assembling analytical functions. In this paper, we present GSR, a Generalized Symbolic Regression approach, by modifying the conventional SR optimization problem formulation, while keeping the main SR objective intact. In GSR, we infer mathematical relationships between the independent variables and some transformation of the target variable. We constrain our search space to a weighted sum of basis functions, and propose a genetic programming approach with a matrix-based encoding scheme. We show that our GSR method is competitive with strong SR benchmark methods, achieving promising experimental performance on the well-known SR benchmark problem sets. Finally, we highlight the strengths of GSR by introducing SymSet, a new SR benchmark set which is more challenging relative to the existing benchmarks.
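The core idea in the abstract, relating a transformation of the target variable to a weighted sum of basis functions of the independent variables, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the basis functions and the log transformation are fixed by hand, the helper `fit_weights` is hypothetical, and ordinary least squares stands in for the weight fitting. The paper's genetic programming search and matrix-based encoding are not reproduced.

```python
import numpy as np


def fit_weights(basis_funcs, X, target):
    """Least-squares weights w such that target ≈ sum_i w[i] * basis_funcs[i](X).

    Hypothetical helper for illustration only; not the paper's procedure.
    """
    # Each basis function maps the (n_samples, n_features) input to one column.
    Phi = np.column_stack([phi(X) for phi in basis_funcs])
    w, *_ = np.linalg.lstsq(Phi, target, rcond=None)
    return w


# Synthetic data (assumed): y = exp(2*x1 + 3*x2), so log(y) = 2*x1 + 3*x2.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(50, 2))
y = np.exp(2.0 * X[:, 0] + 3.0 * X[:, 1])

# Hand-picked basis for f(x); a real search would have to discover these.
basis = [lambda X: X[:, 0], lambda X: X[:, 1]]

# Fit the transformed target log(y) rather than y itself.
weights = fit_weights(basis, X, np.log(y))
print(weights)
```

Fitting `log(y)` instead of `y` turns an expression that is nonlinear in its parameters into a plain linear combination. This is one way to read the benefit of inferring a relationship with "some transformation of the target variable".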
