Browsing by Author "Senanayake, Janaka"

Detection of IoT Malware Based on Forensic Analysis of Network Traffic Features
(Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2022) Nimalasingam, Nisais; Senanayake, Janaka; Rajapakse, Chathura
The use of Internet of Things (IoT) devices has become unavoidable, from handheld devices to automated factory machines and even IoT-embedded vehicles. On average, more than 100 devices are connected to the IoT world per second, and the volume of data these devices generate is enormous. This data is valuable, sometimes invaluable, which attracts cybercriminals and eventually increases the number of cybercrimes. Therefore, identifying malware in IoT is a timely requirement. This research applies Machine Learning (ML) models to identify IoT malware efficiently through forensic analysis of network traffic features, selecting the most distinctive features and combining them with the binary features of the malware families. A large dataset comprising many network traffic collections, covering various network traffic features, was used. The proposed model achieved a detection accuracy of almost 100% during the experimental phase of the study, a result of the feature extraction process for each malware type. The model can be further improved by considering a fog-level implementation of the IoT layer, where the learning would help identify a malicious packet transfer to the network at level zero.
Estimation of the incubation period of COVID-19 using boosted random forest algorithm
(Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2021) Rathnayake, P. P. P. M. T. D.; Senanayake, Janaka; Wickramaarachchi, Dilani
Coronavirus disease was first discovered in December 2019. As of July 2021, nineteen months after this infectious disease emerged, more than one hundred and eighty million cases have been reported. The incubation period of the virus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), can be defined as the period between exposure to the virus and symptom onset. Most of the affected cases are asymptomatic during this period, but they can transmit the virus to others. The incubation period is an important factor in deciding quarantine or isolation periods. According to current studies, the incubation period of SARS-CoV-2 ranges from 2 to 14 days. Because this is a range, it is difficult to identify a specific incubation period for suspected cases. Therefore, all suspected cases should undergo an isolation period of 14 days, which may lead to unnecessary allocation of resources. The main objective of this research is to develop a model that classifies the incubation period using machine learning techniques, after identifying the factors affecting the incubation period. Patient records within the age group 5-80 years were used in this study. The dataset consists of 500 patient records from various countries such as China, Japan, South Korea and the USA. This study identified that the patients' age, immunocompetent state, gender, direct/indirect contact with the affected patients and the residing location affect the incubation period. Several supervised learning classification algorithms were compared to find the best-performing algorithm for classifying the incubation classes. The weighted average of each incubation class was used to evaluate the overall model performance. The random forest algorithm outperformed the other algorithms, achieving 0.78 precision, 0.84 recall, and 0.80 F1-score in classifying the incubation classes. The AdaBoost algorithm was used to fine-tune the model.
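The weighted-average evaluation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the usual support-weighted definition, in which each class's precision, recall, and F1-score is weighted by that class's share of the true labels; the abstract does not spell out the exact formula, and the function name `weighted_scores` and the class labels are hypothetical.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return support-weighted precision, recall, and F1 over all classes.

    Assumption: each class is weighted by its share of the true labels.
    """
    support = Counter(y_true)
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label, n in support.items():
        # True positives: samples of this class that were predicted as this class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / n
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = n / total
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return precision, recall, f1


# Hypothetical incubation classes, for illustration only.
y_true = ["short", "short", "long", "long"]
y_pred = ["short", "long", "long", "long"]
precision, recall, f1 = weighted_scores(y_true, y_pred)
```

With this weighting, a class with few records contributes proportionally less to the overall score than a class with many records.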
Labelled Vulnerability Dataset on Android Source Code (LVDAndro) to Develop AI-Based Code Vulnerability Detection Models
(2023) Senanayake, Janaka; Kalutarage, Harsha; Al-Kadri, Mhd Omar; Piras, Luca; Petrovski, Andrei
Ensuring the security of Android applications is a vital and intricate aspect requiring careful consideration during development. Unfortunately, many apps are published without sufficient security measures, possibly due to a lack of early vulnerability identification. One possible solution is to employ machine learning models trained on a labelled dataset, but currently available datasets are suboptimal. This study creates a sequence of datasets of Android source code vulnerabilities, named LVDAndro, labelled based on Common Weakness Enumeration (CWE). Three datasets were generated through app scanning by altering the number of apps and their sources. LVDAndro includes over 2,000,000 unique code samples, obtained by scanning over 15,000 apps. The AutoML technique was then applied to each dataset as a proof of concept, to evaluate the applicability of LVDAndro in detecting vulnerable source code using machine learning. The AutoML model, trained on the dataset, achieved an accuracy of 94% and an F1-Score of 0.94 in binary classification, and an accuracy of 94% and an F1-Score of 0.93 in CWE-based multi-class classification. The LVDAndro dataset is publicly available and continues to expand as more apps are scanned and added to the dataset regularly. The LVDAndro GitHub Repository also includes the source code for dataset generation and model training.
LYZGen: A mechanism to generate leads from Generation Y and Z by analysing web and social media data
(Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2021) Senanayake, Janaka; Pathirana, Nadeeka
Identifying an appropriate target audience is essential to market a product or a service. A proper mechanism should be followed to generate these potential leads and target audiences. The majority of people who were born between 1981 and 2012 hold top positions in companies. These people are regular social media and website users, since they represent generations Y and Z, and they usually leave digital footprints. Therefore, if an accurate method is followed, it is possible to identify potential contact points by analysing publicly available data. In this research, a novel lead generation mechanism based on analysing social media and web data has been proposed and named LYZGen (Leads of Y and Z Generations). The input to the LYZGen model was an optimised search query based on the user requirement. The model used web crawling, named entity recognition (NER), and pattern identification. The model found and analysed freely available data from social media and other websites. Initially, person name identification was performed. Based on the identified names, an extensive search was carried out to retrieve people's contact points, such as email addresses, contact numbers, and designations. Cross verification of the analysed details was conducted as the next step. The results generator provided the final output, which contained the leads and details. The generated details were verified against responses captured via a survey, which showed that the model could detect lead details with 87.3% average accuracy. The model used only the open data posted on the internet by the people. Therefore, it did not raise extensive privacy or security concerns. The generated results can be used in several ways, including communicating promotional details to the potential target audience.
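The pattern identification step named in the abstract above can be illustrated with a minimal sketch that pulls email addresses and contact numbers out of crawled text using regular expressions. The patterns, the function name `extract_contact_points`, and the sample text are assumptions made for illustration; the abstract does not state which patterns LYZGen uses.

```python
import re

# Hypothetical patterns: a simple email matcher and a loose phone matcher
# (optional "+", then at least nine digits with optional spaces or hyphens).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{7,}\d")


def extract_contact_points(text):
    """Return the unique email addresses and phone-like strings found in text."""
    return {
        "emails": sorted(set(EMAIL_RE.findall(text))),
        "phones": sorted(set(m.strip() for m in PHONE_RE.findall(text))),
    }


# Made-up sample text, for illustration only.
sample = "Contact Jane Doe at jane.doe@example.com or +94 11 234 5678."
contacts = extract_contact_points(sample)
```

In a fuller pipeline, matches like these would still need the cross verification step the abstract describes, since loose patterns can also match unrelated digit sequences.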