Fake News Detection Using Text Mining and Web Scraping
Publisher
Center for Data Science, University of Colombo, Sri Lanka.
Abstract
The dissemination of fake news has become a serious threat to people's daily lives. This paper presents a comprehensive method for detecting fake news in digital information using a combination of deep learning and machine learning techniques. The deep learning models include Bidirectional Encoder Representations from Transformers (BERT) and a recurrent neural network (RNN) with Global Vectors for Word Representation (GloVe) embeddings; the machine learning models include Extreme Gradient Boosting (XGBoost), a Naive Bayes classifier, support vector machines, and random forests. The study uses various textual features, such as term frequency (TF), term frequency-inverse document frequency (TF-IDF), N-gram models, and GloVe embeddings. Unlike prior studies that depend entirely on semantic embeddings, this study combines explicit linguistic and psychological features with semantic representations to improve fake news detection accuracy. The study was conducted on a balanced Kaggle dataset of 6,335 articles and achieved 86% accuracy using the RNN with GloVe embeddings, outperforming the baseline models and contributing a robust hybrid framework applicable to real-time fake news detection.
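As a rough illustration of the TF, TF-IDF, and N-gram features named in the abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`ngrams`, `tf_idf`), the whitespace tokenization, the unsmoothed `log(N / df)` weighting, and the toy headlines are all assumptions made for this example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return word n-grams (joined by spaces) for a list of tokens."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(docs, n=1):
    """Compute TF-IDF weights per document over word n-grams.

    TF is the relative frequency of a term within a document.
    IDF is log(N / df), where N is the number of documents and df is
    the number of documents containing the term (no smoothing; this
    variant is an assumption, other variants exist).
    """
    tokenized = [ngrams(doc.lower().split(), n) for doc in docs]

    # Document frequency: count each term once per document.
    df = Counter()
    for terms in tokenized:
        df.update(set(terms))

    num_docs = len(docs)
    weights = []
    for terms in tokenized:
        counts = Counter(terms)
        total = len(terms) or 1  # avoid division by zero for empty docs
        weights.append({
            term: (count / total) * math.log(num_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy headlines, invented for illustration only (not from the dataset).
docs = [
    "scientists confirm new vaccine results",
    "shocking secret cure doctors hate",
    "scientists publish new study results",
]

unigram_weights = tf_idf(docs, n=1)  # TF-IDF over single words
bigram_weights = tf_idf(docs, n=2)   # TF-IDF over word pairs
```

In this sketch, a word that appears in only one document (such as "shocking") receives a higher weight than a word shared by several documents (such as "scientists"). Feature dictionaries like these could then be passed to a classifier such as Naive Bayes or a support vector machine.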
Citation
Kavinda, M. M. M. T., & Rajapaksha, R. R. L. U. I. (2025). Fake news detection using text mining and web scraping. Proceedings of the 3rd International Conference in Data Science 2025. Center for Data Science, University of Colombo, Sri Lanka. (p. 32).