Paper Title

Data Mining Approach to Analyze Covid19 Dataset of Brazilian Patients

Author

Saire, Josimar E. Chire

Abstract

The pandemic was caused by the coronavirus disease (covid-19), a name coined by the World Health Organization in early 2020. Currently, almost all countries have reported covid19 positive cases, and governments are choosing different health policies to stop the infection. Many research groups are working on patient data to understand the virus, while scientists are looking for a vaccine to strengthen the immune system and tackle the covid19 virus. Brazil is one of the countries with the most infections: by August 11 it had a total of 3,112,393 cases. The Research Foundation of Sao Paulo State (FAPESP) released a dataset, an innovative initiative in collaboration with hospitals (Einstein, Sirio-Libanes), a laboratory (Fleury) and the University of Sao Paulo, to foster research on this trending topic. This paper presents an exploratory analysis of the datasets using a Data Mining approach. Several inconsistencies were found: NaN values, null reference values for analytes, outliers in analyte results, and encoding issues. The outcome is a set of cleaned datasets for future studies, but at least 20% of the data were discarded because of non-numerical values, null values, and numbers outside the reference range.
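The discard rule described in the abstract (non-numerical results, null values, and numbers outside the reference range) can be illustrated with a minimal sketch. This is not the author's code: the field names `result`, `ref_low` and `ref_high`, the helper functions, and the sample rows are all hypothetical and do not reflect the actual schema of the FAPESP dataset.

```python
import math


def to_float(value):
    """Return value as a float, or None if it is missing or non-numeric."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(number) else number


def clean_results(rows):
    """Keep rows whose result is numeric and inside a non-null reference range.

    Field names are assumptions made for this sketch only.
    """
    kept = []
    for row in rows:
        result = to_float(row.get("result"))
        low = to_float(row.get("ref_low"))
        high = to_float(row.get("ref_high"))
        # Discard non-numeric or missing results and missing reference bounds.
        if result is None or low is None or high is None:
            continue
        # Discard results outside the reference range.
        if not (low <= result <= high):
            continue
        kept.append({**row, "result": result, "ref_low": low, "ref_high": high})
    return kept


# Made-up sample rows, for illustration only.
raw = [
    {"analyte": "Hemoglobin", "result": "13.5", "ref_low": 12.0, "ref_high": 16.0},
    {"analyte": "Hemoglobin", "result": "not detected", "ref_low": 12.0, "ref_high": 16.0},
    {"analyte": "Glucose", "result": "250", "ref_low": 70.0, "ref_high": 99.0},
    {"analyte": "Glucose", "result": "90", "ref_low": 70.0, "ref_high": 99.0},
    {"analyte": "Sodium", "result": "140", "ref_low": None, "ref_high": None},
]
cleaned = clean_results(raw)
discarded_fraction = 1 - len(cleaned) / len(raw)
```

Tracking `discarded_fraction` alongside the cleaned rows makes it easy to report how much data a cleaning rule removes, which is the kind of figure the abstract quotes.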
