Paper Title
Perfecting the Crime Machine
Paper Authors
Paper Abstract
This study explores different machine learning techniques and workflows for predicting crime-related statistics, specifically crime type in Philadelphia. We use crime location and time as the main features, derive additional features from these two raw fields, and build models that work with a large number of class labels. We use several techniques to extract features, including combinations of unsupervised learning techniques, and then try to predict the crime type. The models we use include Support Vector Machines, Decision Trees, Random Forest, and K-Nearest Neighbors. We report Random Forest as the best-performing model for predicting crime type, with a log loss of 2.3120.
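For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of multiclass log loss, assuming the standard definition (the mean negative log-probability a model assigns to the true class). The function name, the three-class example, and the probabilities are hypothetical and are not taken from the paper's data or code.

```python
import math

def multiclass_log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    y_true: list of integer class indices (hypothetical crime-type ids).
    probs:  list of per-class probability rows, one row per sample.
    """
    total = 0.0
    for label, row in zip(y_true, probs):
        # Clip so that a predicted probability of exactly 0 or 1
        # does not produce an infinite or degenerate loss.
        p = min(max(row[label], eps), 1.0 - eps)
        total += -math.log(p)
    return total / len(y_true)

# Illustrative data only: two incidents, three crime-type classes.
y_true = [0, 2]
probs = [[0.7, 0.2, 0.1],
         [0.1, 0.3, 0.6]]
loss = multiclass_log_loss(y_true, probs)

# Reference point: a uniform prediction over K classes scores ln(K).
uniform_loss = multiclass_log_loss([0], [[0.1] * 10])
```

Lower values are better. As a point of reference under this definition, a model that spreads its probability uniformly over K classes scores ln(K), so a log loss should be read relative to the number of class labels in the task.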