Detection of β-Thalassemia carriers using data mining techniques
Date
2024
Publisher
The Institute of Applied Statistics, Sri Lanka
Abstract
Thalassemia, a genetic blood disorder, presents a significant challenge in Sri Lanka due to its high prevalence. Traditional methods of identifying thalassemia carriers, such as genetic and blood testing, are costly and time-consuming, and may be unavailable to certain demographic groups. Few studies have examined the efficacy of data mining models for thalassemia carrier detection, so the field is still in its infancy, and evaluating the accuracy and clinical utility of such models is crucial. This study aims to develop a time-efficient model for detecting β-thalassemia carriers that shortens the time needed to reach a decision, and to develop the fitted model into a decision support tool. Eight blood parameters (RBC, HGB, HCT, MCV, MCH, MCHC, RDW, and HbA2) were selected based on the literature. Two model-fitting approaches were introduced, each under a different data selection method: Method 1, model fitting before handling the class imbalance problem, and Method 2, model fitting with the random oversampling technique. Support Vector Machine (SVM) and Probabilistic Neural Network (PNN) models were used for β-thalassemia carrier detection. Method 2 exhibited superior performance, especially with PNN Model 2, which achieved 98.75% overall classification accuracy. The implemented PNN Model 2 could therefore serve as an efficient decision-support tool, offering both time and cost savings in identifying β-thalassemia carriers. Nonetheless, for further investigation, consulting a medical expert is recommended.
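The two techniques named in the abstract, random oversampling and a probabilistic neural network, can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic two-feature points rather than the eight blood parameters, the function names `random_oversample` and `pnn_predict` are hypothetical, and the Gaussian smoothing parameter `sigma` is an arbitrary assumption. The PNN is written in its simplest form, as a per-class average of Gaussian kernels with the largest average deciding the class.

```python
import math
import random
from collections import Counter, defaultdict


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


def pnn_predict(X_train, y_train, x, sigma=1.0):
    """Classify x with a basic PNN: average a Gaussian kernel over the
    training rows of each class and return the class with the largest average."""
    sums = defaultdict(float)
    counts = Counter(y_train)
    for row, label in zip(X_train, y_train):
        sq_dist = sum((a - b) ** 2 for a, b in zip(row, x))
        sums[label] += math.exp(-sq_dist / (2.0 * sigma ** 2))
    return max(counts, key=lambda label: sums[label] / counts[label])


# Synthetic toy data (assumption): two made-up features, NOT real blood parameters.
rng = random.Random(42)
X = [[rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)] for _ in range(40)]  # class 0, majority
X += [[rng.gauss(3.0, 0.5), rng.gauss(3.0, 0.5)] for _ in range(8)]  # class 1, minority
y = [0] * 40 + [1] * 8

# Balance the classes, then classify two query points.
X_bal, y_bal = random_oversample(X, y, seed=1)
class_counts = Counter(y_bal)
pred_majority = pnn_predict(X_bal, y_bal, [0.0, 0.0], sigma=0.5)
pred_minority = pnn_predict(X_bal, y_bal, [3.0, 3.0], sigma=0.5)
```

In a real evaluation, oversampling would normally be applied to the training split only, so that duplicated rows do not leak into the test set, and features on different scales would usually be standardized before computing distances.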
Description
Not Indexed
Keywords
Class-imbalance, Support vector machine, Probabilistic neural network, β-thalassemia carriers
Citation
Subasinghe, G. K., Chandrasekara, N. V., & Premawardhena, A. P. (2024). Detection of β - Thalassemia Carriers using Data Mining Techniques. Sri Lankan Journal of Applied Statistics, 25(3), 147–161. https://doi.org/10.4038/sljas.v25i3.8145