Application of Artificial Intelligence techniques for the differentiation of the socioeconomic level

Keywords: Artificial Intelligence, decision trees, logistic regression, dataset, socioeconomic status

Abstract

This project differentiates people by parameters such as age, sex, and educational level, among others, in order to estimate how high their salary could rise. The problem matters because a person could then anticipate future income from decisions made in the present, such as how much education to pursue and when to start working to gain experience. We approached it with two techniques, logistic regression and a decision tree, so that the two could be compared, and we tested them in Colab (Python) on a dataset of 32,000 records (rows). The decision tree achieved a precision of 0.88 and an accuracy of 0.82. Logistic regression achieved a precision of 0.80 for salaries <=50K and 0.72 for salaries >50K, with an accuracy of 0.7912. Between the two techniques, we therefore prefer the decision tree.
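To make the reported metrics concrete, the following minimal sketch shows how per-class precision and overall accuracy are conventionally computed for the two income classes. It assumes the standard definitions (precision is the share of predictions of a class that are correct; accuracy is the share of all predictions that are correct). The function names and the eight toy labels are hypothetical illustrations, not the study's code or data.

```python
from collections import Counter


def per_class_precision(y_true, y_pred):
    """Return {label: precision} for every label that was predicted at least once.

    Precision of a label = correct predictions of that label / all predictions of it.
    """
    predicted = Counter(y_pred)
    correct = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / predicted[label] for label in predicted}


def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels (assumption: not taken from the 32,000-record dataset).
y_true = ["<=50K", "<=50K", "<=50K", ">50K", ">50K", "<=50K", ">50K", "<=50K"]
y_pred = ["<=50K", "<=50K", ">50K", ">50K", "<=50K", "<=50K", ">50K", "<=50K"]

precision = per_class_precision(y_true, y_pred)
acc = accuracy(y_true, y_pred)

for label, value in sorted(precision.items()):
    print(f"precision[{label}] = {value:.2f}")
print(f"accuracy = {acc:.2f}")
```

Reporting precision separately for <=50K and >50K, as the logistic regression results do, shows whether a model is reliable for both income groups instead of only the more frequent one.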

Received: 2023-12-19
Accepted: 2024-03-01
Published: 2024-03-30
How to Cite
C. Z. Pacori Paucar, M. E. Mayta Condori, L. F. Quispe Sanomamani, and D. G. Montana Neyra, “Application of Artificial Intelligence techniques for the differentiation of the socioeconomic level”, Innov. softw., vol. 5, no. 1, pp. 141-155, Mar. 2024.
Section
Journal papers