Paper Title

Random Competing Risks Forests for Large Data

Authors

Joel Therrien, Jiguo Cao

Abstract

Random forests are a sensible non-parametric model for predicting competing risks data from covariates. However, no existing package adequately handles large datasets ($n > 100,000$). We introduce a new R package, largeRCRF, based on the random competing risks forest theory developed by Ishwaran et al. (2014). We verify the package's validity and accuracy through simulation studies and show that its results are similar to those of randomForestSRC while taking less time to run. We also demonstrate the package on a large dataset that was previously inaccessible, using hardware that is available to most researchers.
