An Ensemble Machine Learning Approach for Stroke Prediction

Premisha, P.; Prasanth, Senthan; Kanagarathnam, Mauran; Banujan, Kuhaneswaran

Please use this identifier to cite or link to this item: http://repository.kln.ac.lk/handle/123456789/25423

Title:	An Ensemble Machine Learning Approach for Stroke Prediction
Authors:	Premisha, P.; Prasanth, Senthan; Kanagarathnam, Mauran; Banujan, Kuhaneswaran
Keywords:	bagging and boosting, ensemble, Machine Learning, medical informatics, stroke
Issue Date:	2022
Publisher:	Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka
Citation:	Premisha P.; Prasanth Senthan; Kanagarathnam Mauran; Banujan Kuhaneswaran (2022), An Ensemble Machine Learning Approach for Stroke Prediction, International Research Conference on Smart Computing and Systems Engineering (SCSE 2022), Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka. 165-170.
Abstract:	According to a recent WHO study, one in four people over the age of 25 will suffer a stroke. This year saw the highest count so far, with around 13.7 million people experiencing a stroke for the first time, of whom 5.5 million died. It is estimated that if no action is taken, the number of fatalities will rise to 6.7 million yearly. The COVID-19 pandemic is expected to contribute significantly to the increased stroke death rate, and adults and patients with only minor risk factors are now affected by stroke more often than in previous years. This study predicts the impact level of stroke by developing an ensemble model that combines classifiers that performed well in isolation. Predicting a patient's stroke status would help physicians determine the prognosis and provide targeted therapy within a limited time. During this study, an ensemble model was built by considering the base, bagging, and boosting classifiers: Support Vector Machine, Naïve Bayes, Decision Tree, Logistic Regression, Artificial Neural Network, Random Forest, XGBoost, LightGBM, and CatBoost. The dataset analyzed in this research consists of 5110 patient records with 12 attributes. The final ensemble model was developed in two phases. In the first phase, the classifiers listed above were trained without hyperparameter tuning; in the second, they were trained with hyperparameter tuning. In both phases they were tested against the fundamental evaluation metrics. In each phase, the classifier with the highest classification accuracy was identified in each of the base, bagging, and boosting categories. Based on these results, the final ensemble model was constructed using the Max Voting approach and yielded an accuracy of 95.76%.
URI:	http://repository.kln.ac.lk/handle/123456789/25423
Appears in Collections:	Smart Computing and Systems Engineering - 2022 (SCSE 2022)
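
The Max Voting step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `max_vote`, the tie-breaking rule, and the example predictions are assumptions made for illustration, and the record does not say which classifier was selected from each category. Each classifier casts one vote per patient, and the most frequent label becomes the ensemble prediction.

```python
from collections import Counter
from typing import List, Sequence


def max_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-classifier label predictions by max (majority) voting.

    predictions[i][j] is the label that classifier i assigns to sample j.
    Ties go to the label voted for by the earliest classifier in the list
    (an assumed rule; the abstract does not specify tie handling).
    """
    if not predictions:
        raise ValueError("at least one classifier is required")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same samples")

    combined = []
    # zip(*predictions) yields, for each sample, the votes of all classifiers.
    for votes in zip(*predictions):
        # most_common keeps first-encountered order among equal counts.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical outputs of the best base, bagging, and boosting classifiers
# for five patients (1 = stroke, 0 = no stroke). These values are made up.
best_base = [0, 1, 0, 0, 1]
best_bagging = [0, 1, 1, 0, 0]
best_boosting = [0, 0, 1, 0, 1]

ensemble = max_vote([best_base, best_bagging, best_boosting])
print(ensemble)  # [0, 1, 1, 0, 1]
```

With three voters and two classes a tie cannot occur, so the tie rule only matters if an even number of classifiers is combined.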

Files in This Item:

File	Description	Size	Format
SCSE 2022 25.pdf		17.25 kB	Adobe PDF
