Linear Regression application to predict the popularity index in Spotify

Keywords: Python, Linear Regression, Predict

Abstract

Streaming services have become one of the main channels of music consumption worldwide. Spotify, one such service, offers a catalog of more than thirty million songs. Song production increases every year, which makes it harder for any single song to establish itself as a hit. The objective of this work was to apply linear regression to a dataset of song popularity indices from the Spotify platform, in order to identify a trend in the data and use it to predict the popularity of new songs. A quantitative methodology was applied, based on measurable data taken from datasets. The model obtained a mean squared error of 94.79 and a variance of 0.20. The work concludes that the dataset used was not ideal for this objective.
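The workflow summarized in the abstract (fit a linear regression, then score it on held-out data) can be sketched as follows. This is a minimal illustration, not the authors' code: the data are synthetic, the three features and their coefficients are hypothetical, and the helper names are invented for this example. It also assumes that the reported "variance" is an explained-variance-style score, which the abstract does not confirm.

```python
import numpy as np


def fit_linear_regression(X, y):
    """Ordinary least squares with an intercept; returns [intercept, w1, w2, ...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply the fitted intercept and weights to a 2-D feature matrix."""
    return coef[0] + X @ coef[1:]


def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    return float(np.mean((y_true - y_pred) ** 2))


def explained_variance(y_true, y_pred):
    """1 - Var(residuals) / Var(y); 1.0 means a perfect fit (assumed metric)."""
    return float(1.0 - np.var(y_true - y_pred) / np.var(y_true))


# Synthetic stand-in for a song dataset (hypothetical features, not the paper's data).
rng = np.random.default_rng(0)
n = 500
X = rng.uniform(0.0, 1.0, size=(n, 3))
true_coef = np.array([30.0, 10.0, -5.0, 8.0])
y = true_coef[0] + X @ true_coef[1:] + rng.normal(0.0, 10.0, size=n)

# Hold out the last 20% of rows to evaluate predictions on unseen data.
split = int(0.8 * n)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

coef = fit_linear_regression(X_train, y_train)
y_pred = predict(coef, X_test)
mse = mean_squared_error(y_test, y_pred)
ev = explained_variance(y_test, y_pred)

print(f"MSE: {mse:.2f}")
print(f"Explained variance: {ev:.2f}")
```

A low explained-variance score alongside a large error, as in the results reported above, suggests that the available features capture only a small part of what drives popularity. That reading is consistent with the authors' conclusion about the dataset.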


Received: 2023-05-07
Accepted: 2023-08-15
Published: 2023-09-30
How to Cite
[1]
C. Vasquez Alvarez, E. Coaquira Cuevas, E. Mendoza Hilasaca, and J. Pinto Ñaupa, “Linear Regression application to predict the popularity index in Spotify”, Innov. softw., vol. 4, no. 2, pp. 121-135, Sep. 2023.
Section
Journal papers