Predictive modeling of 30-day readmission risk of diabetes patients by logistic regression, artificial neural network, and EasyEnsemble

Bibliographic Details
Published in: Asian Pacific Journal of Tropical Medicine
Main Authors: Xiayu Xiang, Chuanyi Liu, Yanchun Zhang, Wei Xiang, Binxing Fang
Format: Article in Journal/Newspaper
Language: English
Published: Wolters Kluwer Medknow Publications 2021
Online Access: https://doi.org/10.4103/1995-7645.326254
https://doaj.org/article/9b452c07ad8d4d189d637c3b373c071b
Description
Summary: Objective: To determine the most influential data features and to develop machine learning approaches that best predict hospital readmissions among patients with diabetes. Methods: In this retrospective cohort study, we surveyed patient statistics and performed feature analysis to identify the data features most strongly associated with readmissions. All-cause, 30-day readmission outcomes were classified using logistic regression, an artificial neural network, and EasyEnsemble. The F1 statistic, sensitivity, and positive predictive value were used to evaluate model performance. Results: We identified the 14 most influential data features (4 numeric and 10 categorical) and evaluated 3 machine learning models with numerous sampling methods (oversampling, undersampling, and hybrid techniques). The deep learning model offered no improvement over the traditional models (logistic regression and EasyEnsemble) for predicting readmission, whereas those two algorithms showed much smaller differences in performance between the training and testing datasets. Conclusions: Machine learning approaches applied to electronic health record data offer a promising method for improving readmission prediction in patients with diabetes. However, more work is needed to construct datasets with more clinical variables beyond the standard risk factors and to fine-tune and optimize machine learning models.
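
The three evaluation measures named in the summary (sensitivity, positive predictive value, and the F1 statistic) can be illustrated with a minimal sketch. The function name `readmission_metrics` and the toy labels below are assumptions made purely for illustration; they are not taken from the article and do not reproduce its data or results. A label of 1 is assumed to mean "readmitted within 30 days".

```python
def readmission_metrics(y_true, y_pred):
    """Compute sensitivity, PPV, and F1 for binary labels (1 = readmitted).

    Hypothetical helper for illustration only; not the authors' code.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Sensitivity (recall): share of actual readmissions that were flagged.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Positive predictive value (precision): share of flagged cases that
    # were actual readmissions.
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    # F1: harmonic mean of sensitivity and PPV, written in count form.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return {"sensitivity": sensitivity, "ppv": ppv, "f1": f1}


# Toy, made-up labels: 3 readmissions among 10 patients (imbalanced classes).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
metrics = readmission_metrics(y_true, y_pred)
print(metrics)
```

None of these measures counts true negatives, which is one plausible reason to prefer them over plain accuracy when readmissions are the minority class, as the sampling methods mentioned in the summary suggest.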