ISSN: 2582-9793

EHR Innovations: Shedding Light on Anemia in the Healthcare Paradigm

Original Research (Published On: 28-Sep-2024)
DOI: https://dx.doi.org/10.54364/AAIML.2024.43154

Shambhab Chaki, Souhardya Das, Proma Mondal, Pratyusha Rakshit and Archana Chowdhury

Adv. Artif. Intell. Mach. Learn., 4 (3):2648-2664

Shambhab Chaki : Jadavpur University

Souhardya Das : Jadavpur University

Proma Mondal : Jadavpur University

Pratyusha Rakshit : Jadavpur University

Archana Chowdhury : Christian College of Engineering and Technology, Bhilai

Article History: Received on: 16-Jul-24, Accepted on: 21-Sep-24, Published on: 28-Sep-24

Corresponding Author: Shambhab Chaki

Email: shambhabc@gmail.com

Citation: Souhardya Das, Proma Mondal, Shambhab Chaki, Pratyusha Rakshit, Archana Chowdhury. (2024). EHR Innovations: Shedding Light on Anemia in the Healthcare Paradigm. Adv. Artif. Intell. Mach. Learn., 4 (3):2648-2664


Abstract

    

This study introduces a novel approach to Electronic Health Record (EHR) analysis that extends phenotyping with machine learning (ML) models to improve the recognition and treatment of anemia. It first examines the healthcare scenario in India and suggests potential improvements through data-driven personalized care. Using the MIMIC-III dataset, the research applies extensive data preprocessing and analysis to uncover key insights into anemia's prevalence, gender distribution, comorbidities, and Intensive Care Unit (ICU) stays. Partitioning clustering algorithms (K-Means, K-Medoids, and Fuzzy C-Means) and hierarchical clustering algorithms (Agglomerative Clustering, DIANA, and HDBSCAN) are used to identify groups of patients with similar medical profiles. The distance metrics employed are Levenshtein and Euclidean distances, combined with TF-IDF vectorization. The effectiveness of these algorithms is evaluated on Length of Stay (LoS) estimation, a critical parameter in EHR studies. To predict a new patient's LoS, the patient is first assigned to the existing cluster that shows the highest support for the patient's clinical activities. A decision tree regressor is then trained on data from the selected cluster to predict the new patient's LoS, significantly improving predictive accuracy and reliability. Notably, the HDBSCAN algorithm, applied to the TF-IDF vectorizer output, achieves a 50.82% reduction in Root Mean Squared Error (RMSE) compared to the baseline model. The novelty of this study lies in proposing an efficient approach to EHR analysis, specifically for predicting ICU patients' LoS, and in identifying the most effective clustering algorithm for improving healthcare delivery for anemic patients in the Indian healthcare context.
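
As a rough illustration of two ingredients named in the abstract, the sketch below computes a Levenshtein distance between sequences of clinical activities, assigns a new patient to the closest cluster, and reports a percentage reduction in RMSE. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not define "highest support", so nearest-medoid assignment stands in for it here, and the function names, activity codes, and cluster medoids are all hypothetical. The decision tree regressor trained per cluster is omitted.

```python
from math import sqrt


def levenshtein(a, b):
    """Edit distance between two sequences (strings or lists of activity codes)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x from a
                            curr[j - 1] + 1,      # insert y into a
                            prev[j - 1] + cost))  # substitute x with y
        prev = curr
    return prev[-1]


def nearest_cluster(patient, medoids):
    """Assumption: 'highest support' is approximated by the closest medoid."""
    return min(medoids, key=lambda cid: levenshtein(patient, medoids[cid]))


def rmse(y_true, y_pred):
    """Root Mean Squared Error between observed and predicted LoS values."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def rmse_reduction_pct(baseline_rmse, model_rmse):
    """Percentage reduction in RMSE relative to a baseline model."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse


# Hypothetical medoid activity sequences for two clusters (illustrative only).
medoids = {
    0: ["CBC", "TRANSFUSION", "IRON"],
    1: ["CBC", "DIALYSIS", "EPO"],
}
new_patient = ["CBC", "TRANSFUSION"]
cluster_id = nearest_cluster(new_patient, medoids)
```

In the approach the abstract describes, a regressor trained only on the selected cluster's patients would then predict the new patient's LoS, and `rmse_reduction_pct` would compare that model's error with the baseline's.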
