Android comment classification using BERT

Keywords: Topic classification, text classification, natural language processing, BERT

Abstract

This project develops an NLP-based text analysis tool for evaluating Android app user feedback collected from F-Droid. The research is motivated by the lack of an automated solution that analyzes these opinions and classifies them into specific topics. The goal is to give developers, users, and data analysts a detailed view of user preferences and perceptions. The proposal uses English-language data sets from 2014 to 2017 and is implemented in Python with the Pandas library. The BERT model is used for classification, with a specific focus on comparing different models. The graphical interface is built in Visual Studio and lets users enter comments and obtain topic rankings along with word cloud visualizations.

Received: 2023-11-09
Accepted: 2024-01-27
Published: 2024-03-30
How to Cite
[1] S. R. E. Mansilla Ancco and M. A. Pérez Treviños, “Android comment classification using BERT”, Innov. softw., vol. 5, no. 1, pp. 94-110, Mar. 2024.
Section
Journal papers