A Sentiment Analysis of COVID-19 Tweets Data Using Different Word Embedding Techniques

dc.contributor.author: Nawarathne, U.M.M.P.K.
dc.contributor.author: Kumari, H.M.N.S.
dc.date.accessioned: 2024-01-16T04:42:55Z
dc.date.available: 2024-01-16T04:42:55Z
dc.date.issued: 2023
dc.description.abstract: The COVID-19 virus that spread across the world in 2019 caused many casualties and created enormous mental turmoil. During the pandemic, people were confined to prevent the virus from spreading, and in their isolation many used social media platforms such as Twitter to express their ideas. This study therefore analyzed tweets related to COVID-19. First, text preprocessing techniques were applied and sentiment labels were assigned. Several machine learning (ML) models were then trained on the data: Multinomial Naïve Bayes (MNB), Random Forest (RF), Support Vector Machine (SVM), Decision Tree (DT), k-Nearest Neighbours (KNN), Logistic Regression (LR), Extreme Gradient Boosting (XGB), and CatBoost (CB). During training, the text was represented using Bag-of-Words (BoW), Term Frequency-Inverse Document Frequency (TF-IDF), Word2Vec, Global Vectors for Word Representation (GloVe), Bidirectional Encoder Representations from Transformers (BERT), and Robustly Optimized BERT-Pretraining Approach (RoBERTa). The models were evaluated using accuracy, macro-average precision, macro-average recall, and macro-average F1-score. According to the results, the CB model with RoBERTa achieved an accuracy of 97%. It can therefore be concluded that CB with RoBERTa provides better results than the other combinations evaluated when classifying tweet data.
dc.identifier.citation: Nawarathne U.M.M.P.K.; Kumari H.M.N.S. (2023), A Sentiment Analysis of COVID-19 Tweets Data Using Different Word Embedding Techniques, International Research Conference on Smart Computing and Systems Engineering (SCSE 2023), Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka. Page 15
dc.identifier.uri: http://repository.kln.ac.lk/handle/123456789/27353
dc.publisher: Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka
dc.subject: classification, machine learning, sentiment analysis, word embeddings
dc.title: A Sentiment Analysis of COVID-19 Tweets Data Using Different Word Embedding Techniques
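
The abstract names accuracy and macro-average precision, recall, and F1-score as the evaluation metrics. As an illustration only, the following minimal sketch shows how such macro averages are typically computed: each class is scored separately and the per-class scores are averaged with equal weight. The function name `macro_metrics` and the toy labels are assumptions made for this example; they are not taken from the paper, and the authors' actual code and data are not shown in this record.

```python
def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        # Per-class counts of true positives, false positives, false negatives.
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        # Macro average: unweighted mean over classes, so rare classes
        # count as much as frequent ones.
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,
    }


# Toy sentiment labels (made up for this example).
y_true = ["pos", "pos", "neg", "neu"]
y_pred = ["pos", "neg", "neg", "neu"]
scores = macro_metrics(y_true, y_pred)
print(scores)
```

Because every class contributes equally, macro averages can differ noticeably from accuracy when the sentiment classes are imbalanced, which is a common reason for reporting both.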

Files

Original bundle

Name: Proceeding SCSE 2023 (3) 15.pdf
Size: 11.48 KB
Format: Adobe Portable Document Format
