wsdm-cup-2018-music

Publication of the code we used in the WSDM'18 Cup Music Recommendation Challenge.

Usage

Place data files in the data/ and data_common/ directory
Run the python scripts in the following order
- split.py (Optional, creates a local sample)
- prep_songs.py (Some extra data preparation and combination, requires the cleaning of songs.csv as songs_clean.csv)
- prep.py (Some dataset preparations)
- als.py (Creates the latent features)
- cluster.py (Clusters the latent features for users, tracks, artists, and combinations of those)
- create_featureset.py (Combines all data in one dataset to work on)
- lgbm_test.py (Tests a single LGBM model)
- lgbm_cv.py (Creates the n-fold trained ensemble)
The folder to do the work in is defined as FOLDER in each script along with other important parameters
The file helper/feature_list.py defines a list of features that is finally used for model training

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
cluster		cluster
data		data
data_common		data_common
helper		helper
mf		mf
README.md		README.md
als.py		als.py
cluster.py		cluster.py
create_featureset.py		create_featureset.py
lgbm_cv.py		lgbm_cv.py
lgbm_test.py		lgbm_test.py
prep.py		prep.py
prep_songs.py		prep_songs.py
split.py		split.py