Paper Title

Optimal Ratio for Data Splitting

Author

Joseph, V. Roshan

Abstract

It is common to split a dataset into training and testing sets before fitting a statistical or machine learning model. However, there is no clear guidance on how much data should be used for training and how much for testing. In this article we show that the optimal splitting ratio is $\sqrt{p}:1$, where $p$ is the number of parameters in a linear regression model that explains the data well.
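As a rough illustration of the stated ratio, a train-to-test ratio of $\sqrt{p}:1$ corresponds to a training fraction of $\sqrt{p}/(\sqrt{p}+1)$. The sketch below turns that fraction into sample counts. The function name `split_sizes`, the rounding rule, and the guard that keeps at least one sample on each side are assumptions made for this example, not details taken from the paper.

```python
import math


def split_sizes(n, p):
    """Return (n_train, n_test) for a sqrt(p):1 train-to-test ratio.

    n: total number of samples.
    p: number of parameters in a linear regression model that
       explains the data well (as described in the abstract).
    Rounding to the nearest integer is an assumption of this sketch.
    """
    if n < 2 or p < 1:
        raise ValueError("need n >= 2 samples and p >= 1 parameters")
    root_p = math.sqrt(p)
    train_fraction = root_p / (root_p + 1.0)
    n_train = round(n * train_fraction)
    # Keep at least one sample in each set (assumed safeguard).
    n_train = min(max(n_train, 1), n - 1)
    return n_train, n - n_train


if __name__ == "__main__":
    for p in (1, 4, 9):
        print(p, split_sizes(1000, p))
```

With $p = 9$, for instance, the ratio is $3:1$, so 1000 samples would be divided into 750 for training and 250 for testing.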
