A COMPARATIVE STUDY OF PREDICTIVE SUPERVISED-MACHINE LEARNING ALGORITHMS ON CARDIOVASCULAR DISEASES (CVD)


Alina Faisal
Dr. Syed E. Ahmed
Mahnaz Makhdum
Dr. Farah Naz Makhdum

Keywords

heart failure, supervised machine learning, decision tree, CVD, feature importance, shrinkage estimators

Abstract

Supervised machine learning (S-ML) applications in medicine can help reduce fatality rates, since predicting patterns from clinical data is an arduous challenge for cardiologists. The objective of this comparative study is to determine the best S-ML algorithm across the full model, submodel I, and submodel II for predicting death events due to heart failure. S-ML classification algorithms, including logistic regression, ridge classifier, random forest, decision tree, and support vector machine (SVM), were applied with and without hyperparameter tuning using GridSearchCV to a Kaggle dataset for extensive performance analysis. Two feature importance techniques, random forest feature selection and the SHAP approach using the XGBoost library, were used to create the two submodels. The results suggested decision tree as the best algorithm, as it had the highest accuracy (78%, 77%, 74%) and the lowest root-mean-square error (RMSE) (0.471, 0.483, 0.506) among the S-ML algorithms implemented on all the models. To the best of our knowledge, this is the first empirical analysis to apply linear and James-Stein shrinkage estimator strategies to a CVD dataset. The shrinkage analysis showed submodel II as the best fit, whereas BIC scores showed submodel I as the better-performing model.
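The two metrics reported above are closely linked when a classifier outputs hard 0/1 labels: each squared error is either 0 or 1, so RMSE equals the square root of one minus the accuracy. The sketch below illustrates this identity. It assumes RMSE is computed on predicted class labels rather than probabilities, and the labels and the helper names `accuracy` and `rmse` are hypothetical, not taken from the study.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root-mean-square error between true and predicted labels."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


# Hypothetical labels (not from the study): 1 = death event, 0 = survival.
y_true = [0, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [0, 0, 1, 0, 0, 1, 1, 0, 1, 0]

acc = accuracy(y_true, y_pred)
err = rmse(y_true, y_pred)

# For hard 0/1 labels, the mean squared error is the misclassification rate.
print(f"accuracy = {acc:.3f}, RMSE = {err:.3f}")
print(f"sqrt(1 - accuracy) = {math.sqrt(1 - acc):.3f}")
```

Under the same assumption, an RMSE of 0.471 corresponds to an accuracy of about 0.778, which is consistent with the 78% reported alongside it; the other two pairs (0.483 with 77%, 0.506 with 74%) agree in the same way.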

