A HYBRID ENSEMBLE FRAMEWORK FOR CARDIAC DISEASE RISK STRATIFICATION WITH MACHINE LEARNING


Muhammad Imran
Sadaqat Ali
Hadi Abdullah
Abdul Majid Soomro
Muhammad Ahsan Raza
Tahir Abbas

Keywords

Heart disease prediction, Machine learning, Ensemble classifier, Hybrid technique, Decision Tree, Naive Bayes, SVM, KNN, Logistic regression, RF, Gradient Boosting, XGB

Abstract

Cardiovascular disease (CVD) is one of the leading threats to human health, and its prevalence continues to rise. Timely prediction and early intervention are therefore crucial. Accurately predicting cardiac disease is difficult for both clinicians and software, and the complexity of the cardiovascular system motivates the use of artificial intelligence (AI). Machine learning (ML), a subset of AI, has contributed substantially to the medical sciences, and researchers have applied a variety of ML methods to the identification of cardiac disease. This study aims to improve the accuracy of cardiac disease prediction in order to reduce risk. It proposes a Hybrid Ensemble Framework (HEF) that analyzes cardiac data based on essential features and combines multiple ML classification methods to approach an optimal solution. Using the open-access Cleveland dataset, which contains various cardiac modalities, clinical records, and physiological measurements, the study first evaluates well-known classification techniques: Decision Tree, Naive Bayes, SVM, KNN, logistic regression, Random Forest (RF), Gradient Boosting, and the XGB classifier. Based on this analysis, it proposes the HEF, which applies the Adaptive Boosting (AdaBoost) ensemble technique, with selected hyperparameters, to the results obtained from these ML methods in order to achieve higher accuracy. The proposed HEF achieved an accuracy of 91.80%, precision = 0.94, recall = 0.93, F1-score = 0.92, and macro average = 0.92. The other machine-learning algorithms evaluated scored lower than the proposed model.
The results are also compared with previously published work to show the improvement offered by the proposed technique. By merging ML and ensemble methods, the technique advances the field of cardiac disease risk stratification and opens the door to real-world clinical applications. The HEF improves prediction accuracy and provides insight into the key factors influencing cardiac disease risk, supporting more informed clinical decision-making. Our findings underscore the potential of this hybrid ensemble framework as a tool for improving the detection and management of cardiac diseases and, ultimately, for reducing the burden of CVD on healthcare systems and society.
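
The evaluation metrics quoted in the abstract (accuracy, precision, recall, F1-score, macro average) follow standard definitions based on confusion-matrix counts. The sketch below shows one way to compute them in plain Python. It is an illustration under stated assumptions, not the authors' evaluation code: the function names `binary_metrics` and `macro_f1` are hypothetical, the labels are invented for the example and are not taken from the Cleveland dataset, and "macro average" is assumed here to mean the unweighted mean of the per-class F1-scores, which the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(accuracy, precision, recall, f1)


def macro_f1(y_true, y_pred, labels=(0, 1)):
    """Unweighted mean of per-class F1 (assumed meaning of 'macro average')."""
    scores = [binary_metrics(y_true, y_pred, positive=label).f1 for label in labels]
    return sum(scores) / len(scores)


# Hypothetical labels for illustration only (1 = disease present, 0 = absent).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

metrics = binary_metrics(y_true, y_pred)
print(f"accuracy={metrics.accuracy:.2f} precision={metrics.precision:.2f} "
      f"recall={metrics.recall:.2f} f1={metrics.f1:.2f} "
      f"macro_f1={macro_f1(y_true, y_pred):.2f}")
```

Reporting precision and recall alongside accuracy matters in a screening setting such as this one, because accuracy alone can hide a high rate of missed positive cases when the classes are imbalanced.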


