Segmenting Graduate Unemployment: A clustering approach to understanding employment challenges in Sri Lanka
| dc.contributor.author | Thilakarathne, D. G. S. P. | |
| dc.contributor.author | Rajapaksha, R. R. L. U. I. | |
| dc.date.accessioned | 2026-01-16T06:52:42Z | |
| dc.date.issued | 2025 | |
| dc.description.abstract | This study investigates the demographic and socioeconomic factors influencing graduate unemployment in Sri Lanka through an unsupervised machine learning approach. Data from a structured national survey of 1,250 unemployed graduates, collected between January 2022 and December 2023, were analyzed using Principal Component Analysis (PCA) and three clustering algorithms: K-means, hierarchical clustering (Ward's linkage), and Gaussian Mixture Models (GMM). The aim was to segment the graduate population into meaningful subgroups. After dimensionality reduction, GMM with six clusters was identified as the most suitable configuration, supported by strong validation metrics: the Silhouette Score, the Davies-Bouldin Index, and the Calinski-Harabasz Index. The six clusters exhibited distinct demographic and educational profiles. Clusters 0, 2, and 3 consisted of younger graduates of internal study programs, with shorter unemployment durations (1.60-2.33 months) and residences primarily in urban or semi-urban districts. Clusters 1, 4, and 5 comprised older, externally oriented graduates with longer unemployment durations (3.13-3.62 months), often located in rural districts. Gender-based patterns were also observed: females in internal programs had shorter unemployment durations (1.6-1.9 months) than males (2.3 months). Business and Management graduates were predominantly concentrated in urban districts such as Gampaha and Kandy, whereas Arts and Humanities graduates were largely situated in rural districts including Kurunegala and Batticaloa. The study contributes a data-driven segmentation of unemployed graduates that policymakers can use to design more precisely targeted interventions to reduce structural disparities in graduate employability. | |
| dc.identifier.citation | Thilakarathne, D. G. S. P., & Rajapaksha, R. R. L. U. I. (2025). Segmenting graduate unemployment: A clustering approach to understanding employment challenges in Sri Lanka. Proceedings of the 3rd International Conference in Data Science 2025. Center for Data Science, University of Colombo, Sri Lanka. (p. 65). | |
| dc.identifier.uri | http://repository.kln.ac.lk/handle/123456789/31103 | |
| dc.publisher | Center for Data Science, University of Colombo, Sri Lanka | |
| dc.subject | Clustering algorithms | |
| dc.subject | Gaussian mixture models | |
| dc.subject | Principal component analysis | |
| dc.subject | Socio-economic segmentation | |
| dc.subject | Unsupervised learning | |
| dc.title | Segmenting Graduate Unemployment: A clustering approach to understanding employment challenges in Sri Lanka | |
| dc.type | Article |
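
The workflow summarized in the abstract (PCA, then K-means, Ward hierarchical clustering, and GMM, compared on three validation indices) can be illustrated with a minimal sketch. This is not the authors' code: it assumes scikit-learn, uses synthetic numeric data in place of the survey, and the function name `compare_clusterings`, the number of retained components, and the random seed are all hypothetical choices. Real survey data would also need categorical variables encoded before this step.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.mixture import GaussianMixture
from sklearn.metrics import (
    silhouette_score,
    davies_bouldin_score,
    calinski_harabasz_score,
)


def compare_clusterings(X, n_clusters=6, n_components=2, seed=0):
    """Standardize X, reduce it with PCA, then score three clusterings."""
    Z = PCA(n_components=n_components, random_state=seed).fit_transform(
        StandardScaler().fit_transform(X)
    )
    labelings = {
        "kmeans": KMeans(
            n_clusters=n_clusters, n_init=10, random_state=seed
        ).fit_predict(Z),
        "ward": AgglomerativeClustering(
            n_clusters=n_clusters, linkage="ward"
        ).fit_predict(Z),
        "gmm": GaussianMixture(
            n_components=n_clusters, random_state=seed
        ).fit_predict(Z),
    }
    scores = {}
    for name, labels in labelings.items():
        scores[name] = {
            "silhouette": silhouette_score(Z, labels),  # higher is better
            "davies_bouldin": davies_bouldin_score(Z, labels),  # lower is better
            "calinski_harabasz": calinski_harabasz_score(Z, labels),  # higher is better
        }
    return scores


# Synthetic stand-in data (NOT the survey): six groups of 50 rows, 5 features.
rng = np.random.default_rng(0)
X_demo = np.vstack(
    [rng.normal(loc=c, scale=0.5, size=(50, 5)) for c in range(6)]
)
demo_scores = compare_clusterings(X_demo)
for name, s in demo_scores.items():
    print(name, {k: round(float(v), 3) for k, v in s.items()})
```

The three indices do not always agree, so a configuration is usually chosen by reading them together rather than by any single score; the abstract does not state how the authors weighed them.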