ISSN: 2582-9793

Missing Data Recovery in the e-health context based on Machine Learning models

Original Research (Published On: 10-Dec-2022 )
DOI: 10.54364/AAIML.2022.1135

Ines Rahmany

Adv. Artif. Intell. Mach. Learn., 2 (4):516-532

Ines Rahmany: Faculty of Science and Techniques of Sidi Bouzid, University of Kairouan

Article History: Received on: 19-Oct-22, Accepted on: 01-Dec-22, Published on: 10-Dec-22

Corresponding Author: Ines Rahmany

Email: ines.rahmani@fstsbz.u-kairouan.tn

Citation: Ines Rahmany (2022). Missing Data Recovery in the e-health context based on Machine Learning models. Adv. Artif. Intell. Mach. Learn., 2 (4):516-532

          

Abstract

    

Diabetes mellitus is a group of metabolic diseases characterized by abnormally high blood sugar levels. In 2017, 8.8% of the world’s population had diabetes, and this share is expected to rise to approximately 10% by 2045. Missing data, a prevalent problem even in well-designed and controlled studies, can have a major impact on the conclusions drawn from the available data: it may reduce a study’s statistical validity and lead to erroneous results through distorted estimates. In this study, we hypothesize that (a) replacing missing values using machine learning techniques, rather than the mean or the group mean, and (b) classifying with a support vector machine (SVM) with an RBF kernel will yield the highest accuracy in comparison with traditional techniques such as decision trees (DT), random forests (RF), naive Bayes (NB), SVM, AdaBoost, and artificial neural networks (ANN). Classification results improved significantly when regression, rather than the group median or the mean, was used to replace the missing values. This is a 10% improvement over previously developed strategies reported in the literature.
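
To make the contrast between mean replacement and regression-based replacement concrete, the following is a minimal sketch rather than the method used in the paper. The abstract does not specify the regression model, the dataset, or the features involved, so the single-predictor least-squares fit, the function names, and the toy `bmi` and `insulin` values are all assumptions made for illustration only.

```python
from statistics import mean


def mean_impute(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def regression_impute(target, predictor):
    """Replace None entries in `target` with predictions from a
    least-squares line fitted on the rows where `target` is observed.

    Assumption: one fully observed predictor column; the paper may use
    a different regression model and more features.
    """
    pairs = [(x, y) for x, y in zip(predictor, target) if y is not None]
    x_bar = mean(x for x, _ in pairs)
    y_bar = mean(y for _, y in pairs)
    sxx = sum((x - x_bar) ** 2 for x, _ in pairs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [intercept + slope * x if y is None else y
            for x, y in zip(predictor, target)]


# Hypothetical toy records (not taken from the paper's data).
bmi = [22.0, 25.0, 37.0, 31.0, 34.0]
insulin = [80.0, 95.0, None, 125.0, 140.0]

by_mean = mean_impute(insulin)
by_regression = regression_impute(insulin, bmi)
print("mean imputation:      ", by_mean)
print("regression imputation:", by_regression)
```

In this toy case the mean fill ignores the unusually high predictor value of the incomplete record, whereas the regression fill follows the trend in the observed rows. Either completed column could then be passed to a classifier such as an SVM with an RBF kernel.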
