A Hybrid Approach: Utilising Kmeans Clustering and Naive Bayes for IoT Anomaly Detection

Paper Title

A Hybrid Approach: Utilising Kmeans Clustering and Naive Bayes for IoT Anomaly Detection

Authors

Best, Lincoln; Foo, Ernest; Tian, Hui

Abstract

The proliferation and variety of Internet of Things (IoT) devices mean that they have increasingly become a viable target for malicious users. This has created a need for anomaly detection algorithms that can work across multiple devices. This thesis suggests a potential alternative to current anomaly detection algorithms, one that can be implemented within IoT systems and applied across different types of devices. The algorithm combines unsupervised and supervised machine learning, drawing on the strongest facet of each. It first applies k-means clustering to the attack data and assigns each record to a cluster. The clusters are then used by an AdaBoosted Naive Bayes supervised learning algorithm to learn which piece of data should be assigned to which specific attack. Adding the clustered data before the final classification step increases the accuracy of the proposed algorithm. Its correct identification scores range from 90% to 100%, and the algorithm is also rated on accuracy, precision and recall. These high scores indicate an accurate, flexible, scalable, optimised algorithm that could potentially be deployed in different IoT devices, ensuring strong data integrity and privacy.
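The two-stage pipeline described in the abstract (k-means clustering first, then an AdaBoosted Naive Bayes classifier that also sees the cluster assignment) might be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: it assumes scikit-learn, uses the Gaussian variant of Naive Bayes, appends the cluster id as one extra feature column, and runs on synthetic blobs standing in for "normal" traffic and two attack types. The function names, `n_clusters=3`, and `n_estimators=25` are all hypothetical choices made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB


def fit_hybrid(X_train, y_train, n_clusters=3, seed=0):
    """Stage 1: k-means on the raw features.
    Stage 2: AdaBoosted Gaussian Naive Bayes on features + cluster id."""
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    kmeans.fit(X_train)
    # Assumption: the cluster assignment is added as one extra column.
    X_aug = np.column_stack([X_train, kmeans.labels_])
    # The base estimator is passed positionally so the call works whether
    # the installed scikit-learn names it `estimator` or `base_estimator`.
    clf = AdaBoostClassifier(GaussianNB(), n_estimators=25, random_state=seed)
    clf.fit(X_aug, y_train)
    return kmeans, clf


def predict_hybrid(kmeans, clf, X):
    """Assign each sample to its nearest cluster, then classify."""
    X_aug = np.column_stack([X, kmeans.predict(X)])
    return clf.predict(X_aug)


def run_demo(seed=0):
    """Train and score the sketch on synthetic data (not an IoT dataset)."""
    # Hypothetical stand-in: class 0 = normal, classes 1 and 2 = attacks.
    centers = [[0, 0, 0, 0], [6, 6, 0, 0], [0, 6, 6, 0]]
    X, y = make_blobs(n_samples=600, centers=centers,
                      cluster_std=1.5, random_state=seed)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)
    kmeans, clf = fit_hybrid(X_tr, y_tr, n_clusters=3, seed=seed)
    y_pred = predict_hybrid(kmeans, clf, X_te)
    return {
        "accuracy": accuracy_score(y_te, y_pred),
        "precision": precision_score(y_te, y_pred, average="macro"),
        "recall": recall_score(y_te, y_pred, average="macro"),
    }


if __name__ == "__main__":
    for name, value in run_demo().items():
        print(f"{name}: {value:.3f}")
```

Appending the cluster id is only one way to pass clustering information to the classifier. Distances to each centroid (`kmeans.transform`) would be another option, and the abstract does not say which form the thesis uses. Scores on these well-separated synthetic blobs say nothing about performance on real IoT traffic.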
