Title

Flexible Modeling of Hurdle Conway-Maxwell-Poisson Distributions with Application to Mining Injuries

Authors

Yin, Shuang; Dey, Dipak K.; Valdez, Emiliano A.; Li, Xiaomeng

Abstract

While hurdle Poisson regression is a popular class of models for count data with excessive zeros, the link function in the binary component may be unsuitable for highly imbalanced cases, and ordinary Poisson regression is unable to handle the presence of dispersion. In this paper, we introduce the Conway-Maxwell-Poisson (CMP) distribution and integrate the use of flexible skewed Weibull link functions as a better alternative. We take a fully Bayesian approach to draw inference from the underlying models, to better explain skewness and quantify dispersion, with the Deviance Information Criterion (DIC) used for model selection. For the empirical investigation, we analyze mining injury data for the period 2013-2016 from the U.S. Mine Safety and Health Administration (MSHA). The risk factors describing the proportions of employee hours spent in each type of mining work are compositional data; probabilistic principal components analysis (PPCA) is deployed to deal with such covariates. The hurdle CMP regression is additionally adjusted for exposure, measured by total employee working hours, to make inference on the rate of mining injuries; we test its competitiveness against other models. The resulting model can be used as a predictive tool in the mining workplace to identify features that increase the risk of injuries, so that preventive measures can be implemented.
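For readers unfamiliar with the hurdle CMP construction named in the abstract, the following is a minimal sketch of its probability mass function in the standard textbook form: a zero occurs with probability 1 - p, and positive counts follow a zero-truncated CMP distribution with rate parameter lambda and dispersion parameter nu. The function names, the argument p_pos, and the series truncation at max_terms are illustrative assumptions and are not taken from the paper. In the paper, the probability of a positive count would come from the binary component with a skewed Weibull link; that link, the regression structure, the exposure adjustment, and the Bayesian fitting are not reproduced here.

```python
import math


def cmp_pmf(y, lam, nu, max_terms=200):
    """CMP probability P(Y = y), with the normalising series truncated.

    Unnormalised weight of count j is lam**j / (j!)**nu. The infinite
    normalising sum is approximated by its first `max_terms` terms
    (an assumption of this sketch; larger lam or smaller nu need more terms).
    """
    if lam <= 0 or nu < 0:
        raise ValueError("lam must be positive and nu non-negative")
    if not 0 <= y < max_terms:
        raise ValueError("y must lie in [0, max_terms)")
    # Work on the log scale to avoid overflow in lam**j and j!.
    log_w = [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(max_terms)]
    m = max(log_w)
    log_z = m + math.log(sum(math.exp(w - m) for w in log_w))
    return math.exp(log_w[y] - log_z)


def hurdle_cmp_pmf(y, p_pos, lam, nu, max_terms=200):
    """Hurdle CMP probability P(Y = y).

    p_pos is the probability of crossing the hurdle (a positive count).
    Positive counts follow the CMP distribution truncated at zero.
    """
    if not 0.0 <= p_pos <= 1.0:
        raise ValueError("p_pos must be a probability")
    if y == 0:
        return 1.0 - p_pos
    p_zero = cmp_pmf(0, lam, nu, max_terms)
    return p_pos * cmp_pmf(y, lam, nu, max_terms) / (1.0 - p_zero)
```

With nu = 1 the CMP weights reduce to the Poisson case, nu < 1 gives overdispersion, and nu > 1 gives underdispersion, which is why the hurdle CMP model can capture dispersion that a hurdle Poisson model cannot.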
