Browsing by Author "Kavinga, H. W. B."

Analysing the Factors Affecting to Women’s Engagement in Sri Lankan Labor Force
(Faculty of Commerce and Management Studies, University of Kelaniya, 2024-11-01) Nanayakkara, N. W. H. G. K. K.; Hewaarachchi, A. P.; Kavinga, H. W. B.; Wijebandara, W. A. C.
Women's engagement in the workforce is a key factor in driving economic growth in Sri Lanka. Despite the country's advancements in recent years, women still face significant challenges that prevent them from fully participating in the workforce. The objective of this study is to explore the potential labor force of women in Sri Lanka and the factors affecting their participation, using data from the Labor Force Survey 2021. The study analyzed data from 41,171 women, out of 77,869 individuals, using a binary logistic regression model and considered the following factors: marital status, education level, age group, relationship to the head of the household, district, sector, ethnic group, religion, Sinhala literacy, and English literacy. The results showed that all variables except religion are statistically significant. Married and widowed women are less likely to participate in the labor force than never-married women, while separated and divorced women are more likely to participate. Women in districts such as Nuwara Eliya, Kilinochchi, Kurunegala, Anuradhapura, Badulla, and Rathnapura have higher labor force participation rates. There is a notable gender gap in labor force participation, with males participating more actively than females; over half of the working-age female population remains economically inactive. Females also constitute most of the unemployed population. Although women are more prevalent in urban and rural areas, labor force participation rates are higher in the estate sector. Most women abstain from job searches due to household responsibilities and education levels. Nearly half of discouraged women are concentrated in younger age groups, with 26% aged 25-34 and 21% aged 35-54. The study underscores the necessity of policy interventions that address barriers to women's labor market participation, especially those related to household duties and education, to enhance Sri Lanka's female workforce potential and contribute to its economic and social development.
A comparison of distance-based and model-based clustering methods
(Faculty of Science, University of Kelaniya Sri Lanka, 2023) Nadeekantha, H. A. D. D.; Kavinga, H. W. B.; Gunawardana, A.; Dissanayaka, D. M. P. V.
Most statistical techniques assume that the sample data are homogeneous. However, real-world samples are not always homogeneous. The existence of subgroups within a population leads to non-homogeneity of the sample. In this case, it is not accurate to model the population using a single probability distribution. Hence it is essential to check the homogeneity of the sample. Clustering, an unsupervised learning technique, is used to discover a population's subgroups and assign each observation to a specific cluster. Clustering algorithms can be divided into two main groups: model-based and distance-based algorithms. Model-based algorithms assume a probability distribution for clustering, while distance-based algorithms use a distance metric to assign observations to clusters. Based on summary statistics and visualizations, the literature suggests that model-based clustering methods perform better than distance-based methods. In this study, an inference-based procedure was used to assess that claim. To compare the performance of model-based and distance-based algorithms, an extensive simulation study was conducted. In the simulation study, two univariate Gaussian mixtures with different parameter settings (mean, standard deviation, and sample size) were combined to generate a non-homogeneous sample. Then, model-based and distance-based algorithms were applied to the same simulated datasets with different cluster structures, for which the actual cluster memberships were known. Further, the effect of the bimodality conditions of Gaussian mixtures on both clustering methods was checked. To assess the performance of the two methods, two measures were computed: the ability to identify the correct number of clusters (Cluster Identification Ability, CIA) and the ability to assign observations to the correct clusters (clustering accuracy).
CIA was computed as the percentage of iterations that identified the correct number of clusters, and clustering accuracy was measured using the Adjusted Rand Index (ARI). For most of the simulation settings, both methods required a sample size of less than 200 to achieve high clustering accuracy (a mean ARI of approximately 0.8). For example, a simulation setting with a mean difference of 3.1 and a standard deviation of 0.5 required sample sizes of 20 and 10 for the model-based and distance-based methods, respectively. The minimum sample size needed for high clustering accuracy varies by method, although in some cases the two are approximately the same. The inference-based study, performed using the paired Wilcoxon signed-rank test, indicated that the claim “the model-based method outperforms the distance-based method, or both perform similarly” is valid 82.7% of the time at the 5% level of significance. In conclusion, the CIA and clustering accuracy of the model-based method increased with sample size when the bimodality conditions were satisfied by the mixture. For the distance-based method, both measures decreased as the sample size increased when the bimodality conditions were not satisfied by the sample.
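The two performance measures described in this abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, the simulated sample assumes equal mixing proportions and a common standard deviation (the abstract does not state either), and the midpoint threshold in the usage lines is only a stand-in for a real clustering algorithm.

```python
from collections import Counter
from math import comb
import random


def adjusted_rand_index(true_labels, pred_labels):
    """Adjusted Rand Index between two labelings (needs at least 2 points)."""
    n = len(true_labels)
    # Pair counts from the contingency table and from its row/column margins.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(true_labels, pred_labels)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(true_labels).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred_labels).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def cluster_identification_ability(estimated_k, true_k):
    """Percentage of simulation iterations that found the true cluster count."""
    hits = sum(1 for k in estimated_k if k == true_k)
    return 100.0 * hits / len(estimated_k)


def simulate_two_component_sample(n, mean_diff, sd, rng):
    """Assumed setup: half the points from N(0, sd), half from N(mean_diff, sd)."""
    labels = [i % 2 for i in range(n)]
    values = [rng.gauss(mean_diff * lab, sd) for lab in labels]
    return values, labels


# Usage: the mean difference (3.1) and standard deviation (0.5) come from the
# abstract; the seed, n, and the threshold "clusterer" are illustrative only.
rng = random.Random(42)
values, truth = simulate_two_component_sample(200, mean_diff=3.1, sd=0.5, rng=rng)
predicted = [int(v > 3.1 / 2) for v in values]
ari = adjusted_rand_index(truth, predicted)
cia = cluster_identification_ability([2, 2, 3, 2], true_k=2)
```

ARI is unchanged when cluster labels are permuted, which is why it suits comparisons where the cluster numbering produced by an algorithm is arbitrary.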
Identification of factors leading to elephant deaths in human-elephant conflicts
(Faculty of Science, University of Kelaniya Sri Lanka, 2023) Lakshitha, W. A. D. M.; Chandrasekara, N. V.; Kavinga, H. W. B.; Withanage, N.
Human-elephant conflicts (HEC) have been one of the main challenges facing Sri Lanka for several decades. According to official data from the Department of Wildlife Conservation (DWC), the number of elephant deaths due to HEC per year is higher than the number of human deaths. This research focused on the North Central Province, where the highest number of elephant deaths has been recorded. The objectives of this research are to identify the main factors that have affected the deaths of elephants and to identify suitable models to predict the causes of elephant deaths due to human-elephant conflict. Although there has been much research related to HEC worldwide, no published studies were found in the literature that utilized advanced statistical techniques such as Multinomial Logistic Regression (MLR), LASSO regression, Decision Tree (DT), Support Vector Machine (SVM), and Probabilistic Neural Network (PNN). This research addresses that gap by constructing models for classifying the causes of elephant deaths resulting from HEC. Data were collected from various departments, including DWC, the Department of Meteorology, and the crop calendar of the Department of Agriculture. Pearson's Chi-square and Fisher's exact tests were used to identify associations between the cause of death and influencing factors. Five variables, namely the elephant age group, grass levels, gender, rainfall season, and place of death, were found to significantly influence the cause of death of an elephant. MLR and Data Mining (DM) techniques were initially utilized, but due to multicollinearity arising in MLR, the LASSO technique was employed as a remedial method. To overcome the class imbalance problem, 90% of the data were randomly selected for model building while maintaining the class ratio of the response variable, and the remaining 10% of the data were used for testing.
Two performance measures, overall classification accuracy (OCA) and Misclassification Percentage of Critical Cases (MPCC), were used to evaluate and compare the classification potential of the models. The final MLR, LASSO, DT, SVM with Polynomial kernel, SVM with Gaussian kernel, and PNN with spread 0.801 models achieved OCA values of 42.30%, 50%, 53.84%, 69.23%, 73.07%, and 73.07%, respectively. The same models showed MPCC values of 34.61%, 30.76%, 7.69%, 11.53%, 19.23%, and 26.92%, respectively. Finally, the SVM model with the Gaussian kernel, which exhibited a high OCA (73.07%) with an MPCC of 19.23%, was selected as the better model, since the PNN showed a higher MPCC of 26.92%. These findings will be helpful for authorities in their future and existing projects.
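The 90/10 split that preserves the class ratio of the response variable, and the OCA measure, can be sketched as follows. This is an illustrative sketch rather than the authors' procedure: the function names, the seed, and the placeholder class labels "A", "B", and "C" are assumptions. MPCC is omitted because the abstract does not define which cases count as critical.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, train_fraction=0.9, seed=0):
    """Return (train_idx, test_idx), splitting each class by the same fraction."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    train, test = [], []
    for lab in sorted(by_class):
        idxs = by_class[lab]
        rng.shuffle(idxs)  # random selection within the class
        cut = round(train_fraction * len(idxs))
        train.extend(idxs[:cut])
        test.extend(idxs[cut:])
    return sorted(train), sorted(test)


def overall_classification_accuracy(actual, predicted):
    """Percentage of cases whose predicted class equals the actual class."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return 100.0 * correct / len(actual)


# Usage with placeholder, imbalanced class labels (not the study's data).
labels = ["A"] * 60 + ["B"] * 30 + ["C"] * 10
train_idx, test_idx = stratified_split(labels)
test_counts = Counter(labels[i] for i in test_idx)
oca = overall_classification_accuracy([1, 1, 0, 0], [1, 0, 0, 0])
```

Splitting within each class keeps rare classes represented in both the training and the test set, which a plain random split does not guarantee when the classes are imbalanced.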
Predicting a top rank batsman in an ODI match, using the first few balls faced: A case study
(Faculty of Science, University of Kelaniya Sri Lanka, 2022) Madhuranga, W. P. K.; Kavinga, H. W. B.; Chandrasekara, N. V.
Predicting the success of a top-rank batsman plays a crucial role in the decision-making process in the game of cricket, on the field as well as off the field. This research was carried out for that purpose. The proposed procedure was applied specifically to the players ranked one, two and three in the world as of August 2021. Therefore, the results cannot be generalized to a wider set of players. Among the several models tried, the Decision Tree (DT) model with a training ratio of 0.9 showed the highest accuracy of 72% in predicting whether the batsman will be successful, i.e., will score fifty or more runs on a given day. Probabilistic Neural Network (PNN) and Support Vector Machine (SVM) models with a similar test ratio resulted in an accuracy of around 65% for the three players, Rohit Sharma, Babar Azam and Virat Kohli. PNN recorded a maximum accuracy of 64.2% when predicting the performance of Rohit Sharma, and the SVM model recorded a maximum accuracy of 59% when predicting the success of Babar Azam. The accuracy of the DT model reported above was achieved using the first five balls for Virat Kohli and Rohit Sharma and the first seven balls for Babar Azam. The findings of the study can be used to support decision-making in the game of cricket.