Paper Title
Effects of Influential Points and Sample Size on the Selection and Replicability of Multivariable Fractional Polynomial Models
Authors
Abstract
The multivariable fractional polynomial (MFP) procedure combines variable selection with a function selection procedure (FSP). For each continuous variable, a closed test procedure is used to decide between no effect, a linear function, an FP1 function, or an FP2 function. Influential points (IPs) and small sample sizes can both affect the selected fractional polynomial model. In this paper, we used simulated data with six continuous and four categorical predictors to illustrate approaches that can help to identify IPs influencing function selection and the MFP model. The approaches use leave-one-out or leave-two-out deletion and two related techniques for a multivariable assessment. In seven subsamples, we also investigated the effect of sample size and the replicability of the selected models. For better illustration, a structured profile was used to provide an overview of all analyses conducted. The results showed that one or more IPs can drive the functions and models selected. In addition, with a small sample size, MFP might not be able to detect non-linear functions, and the selected model might differ substantially from the true underlying model. However, if the sample size is sufficient and regression diagnostics are carefully conducted, MFP can be a suitable approach for selecting variables and functional forms for continuous variables.
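To make the leave-one-out idea concrete, the following is a minimal sketch rather than the procedure used in the paper. It assumes a single positive continuous predictor, a Gaussian outcome fitted by ordinary least squares, and the conventional FP power set {-2, -1, -0.5, 0, 0.5, 1, 2, 3}, in which power 0 denotes log(x). It selects only the best-fitting FP1 power and omits the closed test procedure, FP2 functions, and the multivariable steps. The function names (`fp_transform`, `rss_fp1`, `best_fp1_power`, `leave_one_out_powers`) and the simulated data are hypothetical and chosen for illustration.

```python
import numpy as np

# Conventional FP power set (an assumption of this sketch); 0 denotes log(x).
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_transform(x, p):
    """Return x**p, using the FP convention that p == 0 means log(x)."""
    x = np.asarray(x, dtype=float)
    return np.log(x) if p == 0 else x ** p


def rss_fp1(x, y, p):
    """Residual sum of squares of the OLS fit y ~ b0 + b1 * x^(p)."""
    y = np.asarray(y, dtype=float)
    t = fp_transform(x, p)
    X = np.column_stack([np.ones_like(t), t])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def best_fp1_power(x, y):
    """Pick the FP1 power with the smallest residual sum of squares."""
    return min(FP_POWERS, key=lambda p: rss_fp1(x, y, p))


def leave_one_out_powers(x, y):
    """Best FP1 power after deleting each observation in turn."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = np.ones(len(x), dtype=bool)
    powers = []
    for i in range(len(x)):
        keep[i] = False  # drop observation i
        powers.append(best_fp1_power(x[keep], y[keep]))
        keep[i] = True   # restore it for the next iteration
    return powers


if __name__ == "__main__":
    # Simulated data (hypothetical): a log relationship plus noise.
    rng = np.random.default_rng(0)
    x = rng.uniform(0.5, 10.0, size=100)
    y = np.log(x) + rng.normal(0.0, 0.3, size=100)

    full = best_fp1_power(x, y)
    loo = leave_one_out_powers(x, y)
    # Observations whose deletion changes the selected power are candidates
    # for closer inspection as potentially influential points.
    flagged = [i for i, p in enumerate(loo) if p != full]
    print("power on full data:", full)
    print("observations that change the selected power:", flagged)
```

An observation flagged here is only a candidate for inspection. In a full MFP analysis the comparison would be based on the deviance differences used by the closed test procedure, not on a change in the best FP1 power alone.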