Paper Title

Incident duration prediction using a bi-level machine learning framework with outlier removal and intra-extra joint optimisation

Authors

Artur Grigorev, Adriana-Simona Mihaita, Seunghyeon Lee, Fang Chen

Abstract

Predicting the duration of traffic incidents is a challenging task due to the stochastic nature of events. The ability to accurately predict how long accidents will last can provide significant benefits both to end-users in their route choice and to traffic operation managers in handling non-recurrent traffic congestion. This paper presents a novel bi-level machine learning framework, enhanced with outlier removal and intra-extra joint optimisation, for predicting incident duration on three heterogeneous data sets collected for both arterial roads and motorways from Sydney, Australia and San Francisco, U.S.A. Firstly, we use incident data logs to develop a binary classification prediction approach, which allows us to classify traffic incidents as short-term or long-term. We find the optimal threshold between short-term and long-term traffic incident duration, targeting both class balance and prediction performance, while also comparing the binary and multi-class classification approaches. Secondly, to refine the incident duration prediction to the minute level, we propose a new Intra-Extra Joint Optimisation algorithm (IEO-ML), which extends multiple baseline ML models tested against several regression scenarios across the data sets. Final results indicate that: (a) 40-45 min is the best split threshold for identifying short-term versus long-term incidents, and these incidents should be modelled separately; (b) our proposed IEO-ML approach significantly outperforms baseline ML models in $66\%$ of all cases, showcasing its great potential for accurate incident duration prediction. Lastly, we evaluate the feature importance and show that time, location, incident type, incident reporting source and weather are among the top 10 critical factors which influence how long incidents will last.
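
The threshold search described in the abstract can be illustrated with a minimal sketch. It covers only the class-balance criterion; the abstract also targets prediction performance, which is omitted here. The function names, the candidate thresholds and the toy durations are all hypothetical and are not taken from the paper or its data sets.

```python
# Minimal sketch: choose a short-term vs long-term split threshold by class balance.
# All names and numbers below are illustrative assumptions, not the authors' code or data.

def split_by_threshold(durations, threshold):
    """Label each incident: 0 = short-term (<= threshold minutes), 1 = long-term."""
    return [1 if d > threshold else 0 for d in durations]


def class_balance(labels):
    """Share of the minority class, in [0, 0.5]; 0.5 means a perfectly balanced split."""
    share_long = sum(labels) / len(labels)
    return min(share_long, 1.0 - share_long)


def most_balanced_threshold(durations, candidates):
    """Return the candidate threshold whose split is closest to 50/50."""
    return max(
        candidates,
        key=lambda t: class_balance(split_by_threshold(durations, t)),
    )


# Made-up incident durations in minutes (toy data, for illustration only).
DURATIONS = [5, 10, 15, 20, 30, 35, 42, 50, 60, 90, 120, 240]
CANDIDATES = [20, 30, 40, 50, 60]

best = most_balanced_threshold(DURATIONS, CANDIDATES)
print("most balanced threshold (min):", best)
```

In a fuller pipeline, the incidents on each side of the chosen threshold would then be passed to separate regression models, in line with the abstract's finding that short-term and long-term incidents should be modelled separately.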
