First, we randomly selected 200 samples from the dataset "Spotify Dataset 1921-2020, 600k+ Tracks", with one restriction: the release date had to be 1999 or later.
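The selection step could look roughly like the sketch below. It is only an illustration: the function name `sample_recent_tracks`, the `release_date` field layout (a string starting with a four-digit year), the fixed seed, and the toy rows are all assumptions, not details taken from our actual scripts.

```python
import random


def sample_recent_tracks(rows, n=200, min_year=1999, seed=0):
    """Randomly pick n rows whose release year is min_year or later.

    Assumption: each row is a dict whose 'release_date' string starts
    with a four-digit year, e.g. '2005-03-01' or '2005'.
    """
    eligible = [r for r in rows if int(r["release_date"][:4]) >= min_year]
    if len(eligible) < n:
        raise ValueError(f"only {len(eligible)} eligible rows, need {n}")
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    return rng.sample(eligible, n)  # sampling without replacement


# Toy rows standing in for the dataset (hypothetical values);
# real rows could come from csv.DictReader over the dataset file.
toy_rows = [
    {"id": f"track{i}", "release_date": f"{1990 + i % 31}-01-01", "popularity": i % 100}
    for i in range(1000)
]
picked = sample_recent_tracks(toy_rows, n=200)
```

Filtering before sampling keeps the draw uniform over the eligible tracks, and the seed makes it possible to reproduce the same 200 tracks later.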
We then downloaded those songs manually, computed the Mel spectrogram of each one, and converted the spectrograms to 100-dimensional vectors. We used these vectors to train three models to predict the "popularity" value defined in the dataset. However, because of the small amount of data, the regression models did not perform well. Next, we define a variable
Then we used the new data to train three binary classification models: a sequential neural network, a decision tree, and a random forest. We obtained some results with these models.
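As a rough illustration of the classification step, the sketch below trains two of the three model types with scikit-learn. Everything specific in it is an assumption: the features are random numbers standing in for the 100-dimensional vectors, the labeling rule (popularity above the sample median) is hypothetical and not the variable we actually defined, and the hyperparameters are arbitrary. The neural network is left out to keep the example short.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-ins: 200 tracks x 100 features, popularity in 0..100.
X = rng.normal(size=(200, 100))
popularity = rng.integers(0, 101, size=200)

# Hypothetical labeling rule (an assumption, not our real definition):
# 1 if the track is more popular than the sample median, else 0.
threshold = np.median(popularity)
y = (popularity > threshold).astype(int)

# Hold out 25% of the tracks for evaluation, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

models = {
    "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model and record its accuracy on the held-out tracks.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
```

Because the features here are pure noise, the accuracies should hover around chance; the point is only the shape of the pipeline, not the numbers.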
- We also tried PCA for dimensionality reduction, but it did not give good results, so we won’t describe it further.