Cardio forecast: A multi criteria weighted voting ensemble model for the prediction of cardiovascular disease level.

Kitchilan, F. Z.; Samankula, W. G. D. M.

Please use this identifier to cite or link to this item: http://repository.kln.ac.lk/handle/123456789/18231

Full metadata record

DC Field	Value	Language
dc.contributor.author	Kitchilan, F. Z.	-
dc.contributor.author	Samankula, W. G. D. M.	-
dc.date.accessioned	2017-11-22T10:31:38Z	-
dc.date.available	2017-11-22T10:31:38Z	-
dc.date.issued	2017	-
dc.identifier.citation	Kitchilan, F. Z., and Samankula, W. G. D. M. (2017). Cardio forecast: A multi criteria weighted voting ensemble model for the prediction of cardiovascular disease level. International Research Symposium on Pure and Applied Sciences, 2017, Faculty of Science, University of Kelaniya, Sri Lanka, p. 142.	en_US
dc.identifier.uri	http://repository.kln.ac.lk/handle/123456789/18231	-
dc.description.abstract	Computational intelligence and data mining in the medical domain involve the discovery of unknown patterns and useful knowledge from massive sets of healthcare data. Different machine learning algorithms and statistical analysis tools support diagnostic procedures. This research presents an intelligent, medical-knowledge-driven approach for the prognosis of heart disease using a weighted voting ensemble classifier. The proposed ensemble classifier overcomes the limitations of single and hybrid data mining techniques by combining the outcomes of three heterogeneous classifiers, namely Gaussian Naïve Bayes, Extra Trees and AdaBoost. The clinical data were obtained from the processed Cleveland heart disease data set at the UCI (University of California, Irvine) Machine Learning Repository. Different pre-processing techniques are used to clean the data. The heart disease data set has thirteen (13) independent attributes (the predictors) and one (1) dependent attribute (the class to be predicted). This study was conducted as quantitative research, and many related works on data mining and heart disease were reviewed and analyzed methodically to gain a deep insight into the field. Because the study is based on statistics and numeric data, the research is quantifiable, and various relations were found between clinical data features. The main aim of the study is to enhance the overall accuracy of cardiovascular disease prediction. The class attribute in the original data set ranges from 0 to 4, from a healthy heart at 0 to a severely unhealthy heart at 4. In the analysis, the multiclass label was changed to a binary class. The predicted class is either 0 (‘negative heart disease’) or 1 (‘positive heart disease’). K-fold cross validation is used to evaluate the performance of the proposed ensemble model. 
The weighted voting ensemble classifier achieves high classification and diagnosis accuracy, improving several performance metrics by a reasonable margin compared with other individual and ensemble classifiers. The outcomes show that the proposed voting classifier deals with all types of attributes and achieves the highest accuracy of 92.11%, with an F1 score of 91.89%, precision of 92.24% and recall of 92.47%. Before this research can be applied clinically, it needs to be elaborated further with the aid of professionals in both the cardiac and machine learning domains.	en_US
dc.language.iso	en	en_US
dc.publisher	International Research Symposium on Pure and Applied Sciences, 2017 Faculty of Science, University of Kelaniya, Sri Lanka.	en_US
dc.subject	AdaBoost	en_US
dc.subject	Binary class classification	en_US
dc.subject	Extra Trees	en_US
dc.subject	Gaussian Naïve Bayes	en_US
dc.subject	K-fold cross validation	en_US
dc.title	Cardio forecast: A multi criteria weighted voting ensemble model for the prediction of cardiovascular disease level.	en_US
dc.type	Article	en_US
Appears in Collections:	IRSPAS 2017
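
The abstract describes three mechanical steps: collapsing the 0 to 4 disease level into a binary class, combining three classifiers by weighted voting, and evaluating with K-fold cross validation on accuracy, precision, recall and F1 score. The following is a minimal, library-free Python sketch of those steps, not the authors' implementation. The abstract does not state the voting variant, the classifier weights, the number of folds or the software used, so the soft (probability-averaging) vote and every function name below are assumptions chosen for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def binarize_label(level: int) -> int:
    """Map the original 0-4 disease level to 0 (negative) or 1 (positive)."""
    return 0 if level == 0 else 1


def weighted_soft_vote(probas: Sequence[Sequence[float]],
                       weights: Sequence[float]) -> int:
    """Return the class with the highest weighted average probability.

    probas[i][c] is classifier i's probability for class c.
    Soft voting is an assumption; the abstract only says "weighted voting".
    """
    if len(probas) != len(weights):
        raise ValueError("one weight is required per classifier")
    total = sum(weights)
    n_classes = len(probas[0])
    averaged = [
        sum(w * p[c] for w, p in zip(weights, probas)) / total
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=lambda c: averaged[c])


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split sample indices into k contiguous (train, test) folds, unshuffled."""
    if k < 2 or k > n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    folds = []
    start = 0
    for i in range(k):
        # The first (n_samples % k) folds receive one extra sample.
        size = n_samples // k + (1 if i < n_samples % k else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

In this sketch each inner list passed to `weighted_soft_vote` would hold one classifier's predicted probabilities for a single patient, for example from Gaussian Naïve Bayes, Extra Trees and AdaBoost in that order. With scikit-learn, a comparable setup could be built with `VotingClassifier` and its `weights` parameter, though the record does not say which tooling the authors used.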

Files in This Item:

File	Description	Size	Format
142.pdf		718.55 kB	Adobe PDF
