Classification of news categories using BERT

Keywords: News classification, natural language processing, BERT, machine learning, artificial intelligence

Abstract

This project develops a natural language processing (NLP) model that classifies news articles using previously labeled datasets. The main objective is a system that automatically assigns each news item to one of five predefined categories: business, entertainment, politics, sports, or technology. The work involves data preprocessing, feature extraction, training a machine learning model, and evaluating its performance with metrics such as accuracy, recall, and F1-score. These metrics indicate how well the model predicts the correct category for a new or unlabeled news item. If performance is satisfactory, the model can be used to classify unlabeled news in real time. In summary, the project seeks an efficient and accurate solution for organizing and labeling news content with the help of artificial intelligence.
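
As an illustration of the evaluation step mentioned in the abstract, the following minimal sketch computes accuracy, recall, and F1-score for the five categories in plain Python. The function name `evaluate`, the macro-averaging choice, and the toy labels are assumptions made for this example; they are not taken from the paper and do not reproduce its results.

```python
from collections import Counter

# The five categories named in the abstract.
CATEGORIES = ["business", "entertainment", "politics", "sports", "technology"]


def evaluate(y_true, y_pred, labels=CATEGORIES):
    """Return accuracy plus per-class and macro-averaged recall and F1.

    Hypothetical helper for illustration only; macro averaging is an
    assumption, since the abstract does not say how scores are averaged.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fn[truth] += 1   # the true class was missed
            fp[pred] += 1    # the predicted class was wrongly assigned

    per_class = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[label] = {"precision": precision, "recall": recall, "f1": f1}

    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "macro_recall": sum(m["recall"] for m in per_class.values()) / len(labels),
        "macro_f1": sum(m["f1"] for m in per_class.values()) / len(labels),
        "per_class": per_class,
    }


if __name__ == "__main__":
    # Toy labels invented for this example (not data from the paper).
    y_true = ["business", "business", "sports", "sports",
              "politics", "technology", "entertainment", "technology"]
    y_pred = ["business", "politics", "sports", "sports",
              "politics", "technology", "entertainment", "business"]
    report = evaluate(y_true, y_pred)
    print(f"accuracy:     {report['accuracy']:.3f}")
    print(f"macro recall: {report['macro_recall']:.3f}")
    print(f"macro F1:     {report['macro_f1']:.3f}")
```

Reporting recall and F1 alongside accuracy helps when the categories are unevenly represented, since accuracy alone can hide poor performance on a small class.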

Received: 2023-03-18
Accepted: 2023-06-28
Published: 2023-09-30
How to Cite
[1] B. L. Machado Medina, C. A. Santillana Quirita, and S. V. Bautista Luque, “Classification of news categories using BERT”, Innov. softw., vol. 4, no. 2, pp. 36-51, Sep. 2023.
Section
Journal papers