Impact of BERT Embedding on Deep Learning for Short Text Classification

Publisher

Faculty of Graduate Studies, University of Kelaniya, Sri Lanka.

Abstract

Short text classification has gained considerable attention because social media platforms such as Facebook, Twitter and Instagram generate huge volumes of text data. One of the primary tasks on these platforms is to detect and classify that data correctly, but the short length of each text record makes classification challenging. Recent research mostly combines embedding-based representations with neural network models to improve short text classification performance. BERT is one of the language models that has been used to produce embedding features for user-generated text datasets. However, it is hard to find a comprehensive comparison of short text classification performance across different neural network models that use BERT embedding representations. Therefore, this experiment was conducted to investigate the impact of basic neural models such as CNN, LSTM and RNN on short text classification performance. Variations of those neural models, such as Bi-LSTM and CNN-LSTM, were also explored. A publicly available Twitter dataset was selected for the experiment. After preprocessing, the dataset was divided in a 70:30 ratio into training and validation sets. The training set was then divided again in an 80:20 ratio for neural model training and validation, respectively. We conclude that the LSTM model, with 79% classification performance, outperformed the RNN and CNN models. The CNN model gave considerably lower performance, at 23%. Among the model variations, Bidirectional LSTM gave 81.7% classification performance, outperforming other models such as CNN-LSTM.
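The two-stage split described in the abstract (70:30, then 80:20 on the training portion) can be illustrated with a minimal sketch. The abstract does not state which tooling, shuffling strategy, or random seed was used, so the function name `two_stage_split`, its parameters, the seed, and the placeholder records below are all assumptions made for illustration only.

```python
import random


def two_stage_split(records, first_pct=70, second_pct=80, seed=0):
    """Hypothetical helper: shuffle, split first_pct/(100 - first_pct),
    then split the first part again second_pct/(100 - second_pct)."""
    shuffled = list(records)
    # Assumption: a seeded shuffle; the abstract does not say how rows were ordered.
    random.Random(seed).shuffle(shuffled)

    # Stage 1: 70% training portion, 30% held out (called "validation" in the abstract).
    cut1 = len(shuffled) * first_pct // 100
    training, held_out = shuffled[:cut1], shuffled[cut1:]

    # Stage 2: the training portion is split again, 80% for fitting, 20% for validation.
    cut2 = len(training) * second_pct // 100
    model_train, model_val = training[:cut2], training[cut2:]
    return model_train, model_val, held_out


# Placeholder records standing in for preprocessed tweets (not the real dataset).
tweets = [f"tweet {i}" for i in range(1000)]
model_train, model_val, held_out = two_stage_split(tweets)
print(len(model_train), len(model_val), len(held_out))  # 560 140 300
```

Under these assumptions, 1,000 records yield 560 for model training, 140 for validation during training, and 300 held out, so the model is fitted on 56% of the full dataset.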

Citation

Jayakody, J. R. K. C., & Vidanagama, V. G. T. N. (2023). Impact of BERT embedding on deep learning for short text classification. International Postgraduate Research Conference (IPRC) - 2023. Faculty of Graduate Studies, University of Kelaniya, Sri Lanka. (p. 53).
