Academic Literature Library

Paper Title

Secure and Privacy-Preserving Automated Machine Learning Operations into End-to-End Integrated IoT-Edge-Artificial Intelligence-Blockchain Monitoring System for Diabetes Mellitus Prediction

Authors

Hennebelle, Alain; Ismail, Leila; Materwala, Huned; Kaabi, Juma Al; Ranjan, Priya; Janardhanan, Rajiv

Abstract

Diabetes mellitus, one of the leading causes of death worldwide, has no cure to date and, if left untreated, can lead to severe health complications such as retinopathy, limb amputation, cardiovascular disease, and neuronal disease. It is therefore crucial to take precautionary measures to predict and avoid the onset of diabetes. Machine learning approaches to diabetes prediction have been proposed and evaluated in the literature. This paper proposes an IoT-edge-Artificial Intelligence (AI)-blockchain system for diabetes prediction based on risk factors. The proposed system is underpinned by a blockchain, which provides a cohesive view of risk factor data from patients across different hospitals and ensures the security and privacy of users' data. Furthermore, we provide a comparative analysis of the medical sensors, devices, and methods used to measure and collect risk factor values in the system. Numerical experiments and a comparative analysis were carried out on three real-life diabetes datasets between the proposed system, which uses the most accurate model, random forest (RF), and the two most widely used state-of-the-art machine learning approaches, Logistic Regression (LR) and Support Vector Machine (SVM). The results show that the proposed system using RF predicts diabetes with 4.57% higher accuracy on average than LR and SVM, with 2.87 times more execution time. Data balancing without feature selection does not yield a significant improvement. After feature selection, performance improves by 1.14% and 0.02% for the PIMA Indian and Sylhet datasets, respectively, while it decreases by 0.89% for MIMIC III.