Smart Computing and Systems Engineering - 2022 (SCSE 2022)
Permanent URI for this collection: http://repository.kln.ac.lk/handle/123456789/25392
Item Extraction of Sentiments in Tamil Sentences Using Deep Learning (Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2022) Loganathan, Hirushayini; Sakuntharaj, Ratnasingam
Sentiment analysis is the process of extracting information from a given text that expresses various sensations such as happiness, perturbation, pride, worry, and so on about various functions, human beings, systems, and facts. Sentiment analysis, or opinion mining, uses data mining and natural language processing techniques to discover, retrieve and filter information and opinions from the World Wide Web’s vast textual information. Sentiment analysers for European languages and some Indic languages are fully developed. However, Tamil, which is an under-resourced language with rich morphology, has not experienced these advancements. Only a few experiments have been conducted to determine the sentiments of Tamil text. This paper proposes an approach to sentiment analysis for the Tamil language. The proposed approach uses Long Short-Term Memory, Convolutional Neural Network, and simple Deep Neural Network techniques. Test results show that the Long Short-Term Memory-based deep learning model performs better than the Convolutional Neural Network and the simple Deep Neural Network for sentiment analysis of the Tamil language, with 94.10% accuracy.

Item Machine Learning Approach to Predict Mental Distress of IT Workforce in Remote (Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2022) Gamage, Sanduni Nilushika; Asanka, P. P. G. Dinesh
Due to the coronavirus pandemic prevailing in the world, online workers have been restricted to working remotely for a prolonged period. Working arrangements are now based at home more than before. Since this is novel to society, the short-term and long-term impact of this crisis on people is unknown.
Since various factors can cause mental distress among online workers, periodic screening for mental distress such as anxiety, depression, and stress is necessary for health and well-being. The causes of mental distress are multifactorial. They include socio-demographic, biological, economic, environmental, occupational, and psychological aspects. This paper proposes a concept for a screening system that uses supervised machine learning to predict mental distress from the external features associated with individuals, identifying employees at higher risk and referring them early to professional assistance. The study was conducted in the context of a pandemic era, with COVID-19 as the case study. It was carried out with remote IT workers in Sri Lanka who work as part of a software development team. In total, 481 professionals, chosen according to selection criteria, participated in the study. Appropriate encoding techniques were used to encode categorical variables, and feature selection identified the 25 most important of 60 features. Finally, classification techniques such as Random Forest, SVM, XGBoost, CatBoost, decision tree, and Naïve Bayes were used for modeling. Across the overall measures, the CatBoost algorithm outperformed the other algorithms, with a predictive accuracy of 97.1%, precision of 97.4%, recall of 99.7%, and an F1 measure of 98.5%.

Item A Model to Optimize the Sales and Purchases Invoice Payments of Working Capital in the Fast-Moving Consumer Goods Industry (Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2022) Rathnasekara, J. P. D. T.; Wijayanayake, A. N.; Withanaarachchi, A.
Working capital optimization is critical in real business scenarios since it changes dynamically along with complex physical cash flows. In previous literature, working capital payment optimization mainly focused on the cash conversion cycle and cash on hand.
In those studies, the objectives were to maximize profit, maximize on-hand cash flow, or minimize cost during a predefined period. However, most cash maximization models did not address the time value of money. In real-world scenarios, the time value of money mainly affects working capital and cash flow performance. The proposed model considers the time value of money to obtain the actual cash available at present. The objective of the proposed model is to maximize the current value of the money on hand while minimizing the cost within the considered time frame. The model was tested using Python along with CPLEX libraries. This study will be helpful to researchers, academics, and those working in the finance sector of the manufacturing industry in making better decisions on working capital invoice payments.

Item Deep Neural Architectures for Ethnicity Classification in Face Images (Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2022) Jayasekara, Buddhi G.; Hevapathige, Asela
Ethnicity is a key metric of an individual’s identity, social cluster, physical behaviour and cultural association. Accurate ethnicity identification of humans is required in numerous fields like security, legislation, social analysis and psychology. Ethnicity classification using machine learning is a complex, non-trivial and multi-dimensional research problem due to the feature complexity, class imbalance and the absence of rich data sets. In this research study, we have trained and compared four state-of-the-art deep neural models and their ensemble architectures on the problem of ethnicity classification in large-scale image data. The empirical results demonstrate that these end-to-end deep learning models and their ensemble architectures perform well in learning complex ethnic features in facial images and classifying them.
Of the evaluated models, the Ensemble Convolutional Neural Network provided the highest classification performance, with 78.9% accuracy. We have also tested six prominent pre-trained models using transfer learning for ethnicity classification and achieved comparable results.

Item An Emotion Classification Model for Driver Emotion Recognition Using Electroencephalography (EEG) (Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2022) Gamage, T. A.; Kalansooriya, L. P.; Sandamali, E. R. C.
Road accidents have been a critical issue that has resulted in fatal injuries, disabilities, and deaths for many individuals worldwide. The notion of Human-Computer Interaction (HCI) is widely considered in monitoring drivers to safeguard their lives on roads. As a solution to the high rate of road accidents, driver emotion recognition approaches have gained much attention, and the involvement of biological signals in detecting the emotional states of drivers is also significant. The authors conducted a comprehensive review of contemporary literature on the driver emotion recognition paradigm and identified four emotional states for monitoring drivers' affective states in this research. This paper presents a novel approach to detecting sad, angry, fearful, and calm emotional states of drivers with an emotion classification model using Electroencephalography (EEG) signals, where EEG data acquisition is done using the Emotiv EPOC X device. The collected EEG data are preprocessed using the EEGLAB toolbox in Matlab, and feature extraction, selection, and emotion classification model training are done using Matlab. EEG acquisition and preprocessing have already been completed, and as further work, the authors will train the proposed emotion classification model as laid out in this paper.
The findings of this research encourage the authors to continue towards completion and provide further insights into enhancing research in the driver emotion recognition paradigm.

Item Mapping of Sri Lankan Road Signs by Using Google Street View Images (Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2022) Kiridana, Y. M. W. H. M. R. P. J. R. B.; Weerarathna, P. L. M.; Wijesingha, W. P. D. Y.; Aashiq, M. N. M.; Kumara, W. G. C. W.; Haleem, M. A. L. A.
The development of autonomous vehicle driving systems and Intelligent Transportation Systems (ITS) has drawn massive attention since the 1980s. For the development of ITS, road sign detection and identification are considered very important due to the vital information provided by road signs. Generally, real-time video-based methods are used as the source of images for the operation of ITS. However, they are inefficient and costly due to limitations such as weather conditions, lighting conditions, and limited range in obtaining quality images. To overcome the limitations of the video-based approach, this research aims to develop techniques for detecting and identifying road signs by using Google Street View (GSV) as the image source, OpenCV for image processing, and CNN for road sign identification. EdleNet, LeNet-5, and DenseNet were identified as accurate CNN models. Using images from GSV, it was possible to generate a database of road signs with the relevant coordinates, which is currently unavailable in Sri Lanka. In addition, this process leads to the generation of a valuable image dataset of Sri Lankan road signs and a web interface with mapped road signs.
Consequently, this research would yield useful findings that may be applied to future research and provide the means to develop ITS, accident-avoidance systems, and driver assistance systems.

Item Evaluating the Factors that Affect the Reverse Logistics Performance in Plastic Supply Chain (Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2022) Thilakarathne, H. G. K. L. S.; Wijayanayake, A. N.; Peter, S.
Reverse logistics includes all the processes involved in moving goods from their typical final destination to recapture value or for proper disposal. This study aims to identify the factors affecting reverse logistics performance within the plastic supply chains in Sri Lanka and the factors that would facilitate the enhancement of reverse logistics performance. Factors were identified through a literature review and by industry experts. Factors were classified under five domains: economic and market factors, knowledge and awareness factors, policy and legislature factors, management and leadership factors, and technology and infrastructure factors. Factors were analyzed using the Partial Least Squares Structural Equation Modelling (PLS-SEM) technique to evaluate their effect on reverse logistics performance. Results of the analysis show that policy and legislature factors are the only factors that significantly affect reverse logistics performance in plastic supply chains in Sri Lanka. The remaining factors do not show a significant relationship with reverse logistics performance, though they show a positive correlation with it.

Item Real-Time Exam Anomaly Detection in Moodle-based Exam Systems with an AI Agent (Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2022) Akalanka, P. D. A. U.; Manathunga, Kalpani
Online education takes a high priority in the modern world because technology is evolving so rapidly that education needs to adapt to it. Moreover, with the COVID-19 pandemic, e-learning became the only available solution to continue teaching during the lockdown periods. The evaluation of these studies also needs to adapt to the situation. One of the significant issues with this online evaluation method is the anomalies during the evaluation process. The proposed implementation mainly focuses on anomaly detection in Moodle-based exam systems. The proposed system provides a Moodle plugin that records the time taken for each question in the Moodle examination system and detects exam anomalies using the time variations. It then analyzes and calculates the time each candidate has taken for each question and the average time. The invigilator can see the candidates who took more or less than the average time and obtain a list of suspicious candidates. The plugin also contains a separate algorithm that monitors the candidate while sitting the exam. This face detection algorithm notices unusual behaviors of the candidate and uploads the resulting report to the database, and invigilators can access these reports when logged in. To guide candidates, the system also has an AI agent that helps them understand the exam process, gives pre-defined answers to questions, and provides contact details of the relevant authorities for exceptional cases. In addition, the developed plugin detects the system information and background apps that run during the exam, automatically creates relevant reports, and uploads them to the database. After implementation, the system was tested with a selected audience.
The developed application is an innovative initiative to support the Moodle-based examination process.

Item Simulation-Based Efficiency Assessment of Integrated First-Mile Pickup and Last-Mile Delivery in an E-Commerce Logistics Network (Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2022) Ranathunga, M. I. D.; Wijayanayake, A. N.; Niwunhella, D. H. H.
Logistics operations are crucial in the e-commerce supply chain as they involve high costs and have a significant environmental impact. The first-mile and last-mile delivery operations in e-commerce logistics are regarded as the operations with the highest costs. As a result, e-commerce service providers are keen to improve their first-mile and last-mile delivery processes. This study was therefore conducted to optimize the transportation cost and distance of combined first-mile pickup and last-mile delivery operations while meeting practical requirements such as a variety of package types, package compatibility with different types of vehicles, and a heterogeneous fleet of vehicles. After a careful literature review, this paper introduces a mathematical model to optimize simultaneous first-mile pickup and last-mile delivery. The proposed mathematical model was simulated in the SupplyChainGuru® modelling and simulation software. The study concluded that when first-mile pickup and last-mile delivery are routed simultaneously, rather than distributing and collecting the products as two separate operations, the overall cost is reduced by about 34% and the distance by about 42%, while the number of vehicles on the route is reduced and the maximum possible vehicle capacity is utilised.
This simultaneous pickup and delivery will improve routing in the e-commerce logistics supply chain and will serve as a platform for extending the simultaneous pickup and delivery process to other industries as well.

Item Evaluating the Project Management Methodology Adoption Process in the Sri Lankan Software Industry (Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2022) Koggalahewa, Lahiru; Abeysekara, Ruwan
Projects are the primary method of implementing change in an organization. Most software firms tend to use a Project Management Methodology (PMM) to deliver the product to the customer. However, there is no rational set of motives for selecting the desired PMM. Given this context, this study investigates the reasons for adopting a PMM rather than using an internally defined set of processes. The study also evaluated the factors affecting the selection and implementation of the PMM. The study used multiple case study methods, and data were gathered by conducting six in-depth interviews with experienced management personnel in software organizations. The analysis concludes with a list of PMM adoption motives and an account of how PMM selection, adoption and implementation are carried out in the software industry domain from a strategic point of view. The study investigated organizational transitions and change management from a complex managerial perspective, as well as cultural dimensions and individual and organizational social values. The study fills a gap in the literature.
The study also derives a proper processual plan for the adoption motives, selection and adoption process of PMM, which should help inform future strategies in this field.

Item Examine the Impact of IoT for Supply Chain-Based Operations in ERP Systems: Systematic Literature Review (Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2022) Paththinige, Pavani; Thilakarathne, Kasun; Rathnasekara, Tenisha; Wickramaarachch, Ruwan; Withanaarachchi, A.
This study intends to look into the various ways that Industry 4.0 elevates the capabilities of ERP systems. Over recent years, it has become evident that the Industry 4.0 concept has been the birthplace of many innovative technologies across various industries, and the practical implementation of Industry 4.0-related technologies is also rapidly expanding across many industries. Moreover, the incorporation of Industry 4.0-related technologies has dramatically increased the performance of organizations and hence led to sustained profits in the long term. By integrating Industry 4.0 technologies such as IoT and RFID, ERP systems will be able to capture real-time data from the work floor level, enabling them to provide more accurate analytics and predictions. Therefore, upgrading existing traditional ERP systems to support the novel technologies presented by Industry 4.0 will undoubtedly contribute to enhancing the capabilities of ERP systems. Among the limited studies investigating ERP systems from the Industry 4.0 perspective, none has conducted a systematic literature review to investigate how Industry 4.0 enhances the capabilities of ERP systems.
Therefore, this study contributes to the theory by filling that knowledge gap.

Item Identifying and Prioritizing Barriers to Adopting GSCM Practices in Sri Lankan Rubber Product Manufacturing Industry (Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2022) Peiris, M. P. P. L.; Kavirathna, C. A.; Wijayanayake, A. N.
The rubber industry is one of the leading industries contributing to the Sri Lankan economy, but it is considered an industry with high environmental pollution potential. Competition and customer pressure from the international market have further highlighted the need for GSCM (Green Supply Chain Management). Focusing on GSCM is therefore crucial, but the RPMI (Rubber Product Manufacturing Industry) in Sri Lanka is struggling to adopt GSCM. This study was conducted to identify and prioritize barriers to GSCM adoption in the Sri Lankan RPMI and to find the inter-relationships among barriers and GSCM practices. Through the literature review and expert opinions, eight barriers and five GSCM practices were identified and finalized. DEMATEL (Decision-Making Trial and Evaluation Laboratory) was used to find cause-and-effect relationships and interrelationships among factors related to each GSCM practice. Findings showed that the major existing barriers and their significant interrelationships differed depending on the GSCM practice. Lack of knowledge, skills and expertise, lack of information flow, and lack of technology are the major existing barriers to adopting GSCM practices in the Sri Lankan RPMI.
The findings may assist managers in making better long-term decisions to successfully adopt GSCM practices in the Sri Lankan RPMI.

Item Automated Spelling Checker And Grammatical Error Detection And Correction Model for Sinhala Language (Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2022) Goonawardena, Mithma; Kulatunga, Ashini; Wickramasinghe, Raveena; Weerasekara, Thisuraka; De Silva, Hansi; Thelijjagoda, Samantha
Sinhala is a native language spoken by the Sinhalese people, the largest ethnic group in Sri Lanka. It is a morphologically rich language derived from Pali and Sanskrit. The Sinhala language creates a diglossia situation, as the language’s written form differs from its spoken form. Because of this difference, the written form requires more complex rules to be followed in use. Manually proofreading Sinhala material takes up much time and labor, and it can be a tedious task. Hence, a system is needed that can be used by different industries such as journalism, and even by students. At present, only a handful of systems and research efforts have automated Sinhala spelling analysis and grammar analysis. In addition, the existing systems mainly focus on either spelling analysis or grammar analysis. The proposed system, however, will cover both aspects and improve upon existing work by either optimizing or rebuilding the process to provide accurate outputs. The proposed system includes a suffix list built for verbs and subjects, which helps it stand out from the currently proposed solutions. This research intends to implement a service for spell checking and grammar correction of formal Sinhala text. The research follows a rule-based approach, with some components adopting a hybrid approach. For the literature survey, many papers related to different aspects of the proposed system, as well as to complete systems, were analyzed.
The proposed system would be able to overcome most barriers faced in previous papers while taking a fresh approach to providing a solution.

Item A Systematic Investigation on the Effectiveness of the Tabbert Model for Credit Card Fraud Detection (Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2022) Hewapathirana, Isuru; Kekayan, Nanthakumar; Diyasena, Deshanjali
As a result of rapid digitisation, online transactions using credit cards have become popular. With this, fraudulent activities have also increased considerably. Although many supervised and unsupervised machine learning techniques have been proposed in past research for identifying fraudulent transactions, they do not fully utilize the tabular and hierarchical structure present in transaction datasets. Recently, the TabBERT neural network model was proposed to calculate row-wise embeddings that capture both inter- and intra-dependencies between transactions in tabular time series data. In this research, we present a systematic experimental framework to assess the effectiveness of applying the embeddings calculated using the TabBERT model to credit card fraud detection. We employ the calculated row embeddings for fraud detection using three unsupervised machine learning algorithms and two supervised machine learning algorithms. We perform our experiments on a synthetic dataset that has been generated using the TabGPT model. Overall, TabBERT-based embeddings increase the performance of the supervised learning models, with the extreme gradient boosting model achieving a precision of 99% and an F1 score of 98%, and the multilayer neural network model achieving a precision of 97% and an F1 score of 95%.
For unsupervised learning, the use of TabBERT embeddings increases the recall rate of the K-means clustering algorithm by 0.19%.

Item Designing of a Voice-Based Programming IDE for Source Code Generation: A Machine Learning Approach (Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2022) Nizzad, A. R. M.; Thelijjagoda, Samantha
Humans, unlike machines, are precise in recognizing natural languages and responding contextually. Speech recognition, or automatic speech recognition, refers to converting human speech or voice to textual information with the help of artificial intelligence algorithms. With the advancement of Artificial Intelligence technologies and the extensive research being conducted in AI, speech recognition has received much attention and has emerged as a subset of Natural Language Processing, where advances in speech recognition accuracy will open many ways to provide a high standard of human-computer interaction. In this study, a pre-trained transformer model was trained on the English to Python dataset, using a transfer learning approach, to produce syntactically correct source code in Python. Additionally, the Word2Vec model was used to generate voice-to-text as input for the model. For the purpose of demonstration, a custom Python IDE was developed to generate source code from voice input. The results and findings suggest that, with the use of transfer learning, the transformer model can be trained on any dataset to produce syntactically correct source code. The model’s training loss and validation loss were below 5 and 2.1, respectively.
Future research can focus on generating valid source code from any human spoken language without restricting it to English only.

Item Factors Influencing the Secondary Level Students’ Satisfaction in E-Learning: A Case Study of an Educational Institute in Sri Lanka (Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2022) Jayanett, W. I.; Jayalal, Shantha
With the COVID-19 pandemic, e-learning has shown significant growth in Sri Lanka over the last few years. As a remedy for sudden school closures during the COVID-19 outbreak, educational institutes adopted e-learning to minimize the disruption of education. Even though there are benefits, teachers complained that the satisfaction of secondary level students is declining, which has lowered academic performance. Therefore, this research investigates the factors influencing secondary level students’ satisfaction with e-learning at an educational institute in Sri Lanka from the students’ perspective. The participants were 211 secondary-level students from an educational institute. The data were gathered through online questionnaires, following a quantitative approach. Overall results indicate that flexibility is the most influential factor. In order, the quality of the e-learning system/platform, interactivity, the quality of the Internet, and the quality of the learning material also influence students’ satisfaction. As recommendations, the educational institute is advised to select a suitable e-learning platform and use a Learning Management System (LMS). It is also advised to provide a fixed timetable for teachers. The teachers are encouraged to be more interactive and to use computer-based learning materials to deliver the content. The educational institute is further advised to provide adequate teacher training in creating resource materials.
The Ministry of Education is advised to provide a free e-learning system and a lower-cost data package. The Ministry of Education is also recommended to take strategic decisions to make school curriculums more interactive. E-learning system designers should be aware of the school curriculum when designing e-learning systems. Finally, the Government is encouraged to increase coverage and infrastructure facilities to establish a satisfying e-learning environment.

Item An Ensemble Machine Learning Approach for Stroke Prediction (Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2022) Premisha, P.; Prasanth, Senthan; Kanagarathnam, Mauran; Banujan, Kuhaneswaran
Nowadays, one out of four people above 25 will suffer from a stroke. According to a recent WHO study, this year saw the highest count, with around 13.7 million people diagnosed with a stroke for the first time, of whom 5.5 million died. It is estimated that if no action is taken, the number of fatalities will rise to 6.7 million yearly. The COVID-19 pandemic is expected to play a significant role in the increased death rate from stroke. Even adults and patients with minor risk factors are affected by stroke more than in previous years. This study predicts the impact level of stroke by developing an ensemble model that combines various classifiers that performed well in isolation. Predicting the stroke status of patients would help physicians determine the prognosis and assist them in providing targeted therapy in a limited time. During this study, an ensemble model was built by considering base, bagging, and boosting classifiers: Support Vector Machine, Naïve Bayes, Decision Tree, Logistic Regression, Artificial Neural Network, Random Forest, XGBoost, LightGBM, and CatBoost. The dataset analyzed in this research consists of 5110 patient records with 12 attributes.
The final ensemble model was developed by carrying out the methodology in two phases. In the first phase, the classifiers mentioned above were trained without hyperparameter tuning, and in the second phase with hyperparameter tuning; in both, they were tested against the fundamental evaluation metrics. During each phase, the classifier that produces the highest classification accuracy is identified in each of the base, bagging, and boosting categories. From the results obtained, the final ensemble model was constructed using the Max Voting approach, which yielded an accuracy of 95.76%.

Item Common Object Request Broker-based Publisher-Subscriber Middleware for Internet of Things - Edge Computing (Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2022) Perera, Hansani; Jayakody, Anuradha
The edge computing layer in IoT reduces the flow of massive amounts of data directly to the cloud by processing some data in the local network. The middleware in this layer enables this processing of data and the communication between heterogeneous devices and services in the nearby layers. CORBA, which was used as a powerful middleware technology for developing middleware solutions in enterprise-level distributed applications, has been abandoned in the current generation. The paper presents the design and performance evaluation of a publisher-subscriber middleware implemented using CORBA, studied while exploring the applicability of CORBA as an IoT edge computing middleware. The evaluation was carried out in two steps, analysing the number of parallel connections (load test) and the requests handled per unit time (burst test), by simulating an IoT environment in a cloud environment.

Item Stock Market Prediction using Artificial Intelligence (Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2022) Dilhan, M. W. Sachin; Wagarachchi, N. Mihirini
This research focuses on predicting stock closing prices one day ahead or further into the future under specific economic conditions. Today, Sri Lanka faces a financial crisis due to the COVID-19 pandemic. As a result, many investors have gone bankrupt due to unpredictable stock prices. This work mainly focuses on predicting stock prices of banking sector shares such as Commercial Bank (COMB.N), Hatton National Bank (HNB.N), Seylan Bank (SEYB.N), and Sampath Bank (SAMP.N) on the Colombo Stock Exchange (CSE) in Sri Lanka. According to the hypothesis, the All Share Price Index (ASPI) and banking sector indices were taken as numerical sentiment parameters in addition to the historical prices of each bank. This is because the ASPI shows overall market performance and the banking sector indices show how banking sector capitalization changes over time. There can be a positive or negative sentiment when the ASPI and sector indices increase or decrease, respectively. Finally, the dataset is divided into 70% for training and 30% for testing. This study used Recurrent Neural Networks (RNNs), namely Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), with 25, 50, 100, 150, and 200 epochs. The LSTM model gave the lowest Mean Squared Error (MSE) and Root Mean Square Error (RMSE). With the LSTM model, COMB.N, HNB.N, and SAMP.N had the lowest MSE and RMSE at 100 epochs, and SEYB.N had the lowest MSE and RMSE at 150 epochs.

Item A Comparative Study of Clustering English News Articles Using Clustering Algorithms (Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka, 2022) Disayiram, N.; Rupasingha, R. A. H. M.
The news informs us of what is going on in the world. People nowadays read the news that interests them on news websites. There are numerous categories of news. Each newsreader has a different preference for news categories. Sportspeople prioritize sports news, whereas technology fans pay attention to the technology segment of the news.
At the end of the day, each news category is important. Every day, a large amount of information is released on news websites. News sites usually categorize the news; however, not all of the categories are published on those sites. Some categories are given higher attention by news outlets, while others receive less coverage. As a result, finding an appropriate category of news is difficult. These issues make it hard for newsreaders and content seekers to find relevant sections on news websites. Clustering English news articles by relevant category provides a solution to these issues. This research aims to use clustering algorithms to cluster news articles by relevant domain/cluster. We consider five news categories: politics, sports, health, technology, and business. The data collected online was converted into a vector format using term frequency-inverse document frequency (TF-IDF) vectorization. Then, three clustering algorithms, Expectation-Maximization (EM), Simple K-means, and Hierarchical Clustering based on an agglomerative approach, were applied individually to the body of the news and to the news heading. The classes to clusters evaluation mode of the Waikato Environment for Knowledge Analysis (WEKA) tool is used to calculate the accuracy. The EM method had the highest accuracy, 88.5%, with the best results in terms of correctly clustered instances. The comparison between the news heading and the news body demonstrates that the body clustered the news items better than the heading.
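
For readers unfamiliar with the TF-IDF vectorization mentioned in the abstract above, the following is a minimal sketch and not the authors' implementation (the abstract indicates they worked in WEKA). It assumes the plain weighting tf × log(N / df), whitespace tokenization, and a few invented toy headlines; the function name `tfidf` is hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df).

    Assumption: this is the textbook weighting; real tools often apply
    smoothing or normalization variants.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Toy headlines, invented for illustration only.
headlines = [
    "team wins cricket match",
    "bank reports profit growth",
    "team signs new cricket coach",
]
vectors = tfidf(headlines)
```

Under this weighting, a term that appears in every document receives weight zero, while a term confined to a single document is weighted most heavily, which is what lets a clustering algorithm separate articles by their distinctive vocabulary.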