Please use this identifier to cite or link to this item: http://repository.kln.ac.lk/handle/123456789/25536
Title: Comparison between the Machine Learning Algorithms to determine the suitable input features for personal theft, sexual assault, and house burglary victimization prediction
Authors: Lakshan, W. S. V.
Weerakoon, W. A. C.
Keywords: Correlation, Machine Learning, Neural Network, Severity, Victimization
Issue Date: 2022
Publisher: Faculty of Science, University of Kelaniya Sri Lanka
Citation: Lakshan W. S. V.; Weerakoon W. A. C. (2021) Comparison between the Machine Learning Algorithms to determine the suitable input features for personal theft, sexual assault, and house burglary victimization prediction, Proceedings of the International Conference on Applied and Pure Sciences (ICAPS 2021-Kelaniya) Volume 2, Faculty of Science, University of Kelaniya Sri Lanka. Page 54.
Abstract: Most available supervised machine learning models can support crime-related prediction. The possibility of becoming a victim increases daily in each crime category. The main difficulty is determining how severely the crime affects the victim. Here, Random Forest, Support Vector Machine (SVM), K-Nearest Neighbour (KNN), and Neural Network models were compared, using the features available in a secondary dataset, in order to build a better prediction model. The work was implemented in four main phases and covered two aspects: the possibility of becoming a victim and the severity of the crime. The available features were used as the inputs for phase one, and Principal Component Analysis and correlation tests were performed to identify the appropriate and essential feature combinations for the remaining phases. The pre-processed datasets were used to implement and train the models. When the accuracy levels of the models were compared over the different phases, the Random Forest model was the best-performing model, with an accuracy of 85.33% in phase four, while the KNN and Neural Network models obtained an accuracy of over 70% and SVM obtained the lowest accuracy in the same phase. In phase one, the Random Forest algorithm achieved a precision of 76%, while the KNN and Neural Network models obtained around 70%. The results for phase four showed that factors such as age, year, gender, race, and relationship with the perpetrator are the most suitable features for building an accurate machine learning model for victimization prediction. The mentality level of the offender and the intention behind the offence have the main impact on the severity level. Authorities also need to record whether an offence is a repeat offence, whether the person is the main offender, and what the offender's contribution was, so that the prediction models receive better input information.
This study developed a victimization prediction model for personal theft, sexual assault, and house burglary. It is a step forward from previous research on rule-based prediction of a victimization possibility index for small victim clusters. Further, new features were identified in the last phase that can be used to develop models for predicting the behaviour of offenders after they are released back into society. This will greatly help the authorized bodies to monitor released offenders and reduce the possibility of victimization.
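The correlation-based feature screening mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not state which correlation test, threshold, or tooling was used, so the Pearson coefficient, the 0.1 cut-off, the function names `pearson` and `rank_features`, and the synthetic columns `informative_feature` and `noise_feature` are all assumptions made for illustration. The data are randomly generated and do not represent any real victimization pattern.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_features(columns, target, threshold=0.1):
    """Return (name, r) pairs with |r| >= threshold, strongest first.

    The threshold is a hypothetical cut-off, not a value from the study.
    """
    scored = [(name, pearson(values, target)) for name, values in columns.items()]
    kept = [(name, r) for name, r in scored if abs(r) >= threshold]
    return sorted(kept, key=lambda item: abs(item[1]), reverse=True)


# Synthetic data only: one column that drives the label and one that does not.
rng = random.Random(0)
n_rows = 500
informative_feature = [rng.gauss(0, 1) for _ in range(n_rows)]
noise_feature = [rng.random() for _ in range(n_rows)]
label = [1 if x + rng.gauss(0, 0.5) > 0 else 0 for x in informative_feature]

ranking = rank_features(
    {"informative_feature": informative_feature, "noise_feature": noise_feature},
    label,
)
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

In a setting like the one described, a screen of this kind would be run on the pre-processed dataset before training, and only the retained columns would be passed to the classifiers in the later phases. Pearson correlation captures linear association only, so a study with categorical inputs such as gender or race may well have used a different test.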
URI: http://repository.kln.ac.lk/handle/123456789/25536
Appears in Collections: ICAPS 2022

Files in This Item:
File: ICAPS 2022 54.pdf
Size: 166.67 kB
Format: Adobe PDF

