CSE 4683 : Machine Learning and Soft Computing, Fall 2022, Emma Wade, Michelle Hardin, John Austin Reed
Data and Environment :
- environment.yml : conda environment specification. Create the environment with conda env create -f environment.yml; for background on sharing environments via YAML, see https://edcarp.github.io/introduction-to-conda-for-data-scientists/04-sharing-environments/index.html
- Data : available at https://bitgrit.net/competition/11 and in the project Canvas submission
Source Code :
- file-prep.py : prepares the training and testing files. Steps include one-hot encoding of categorical variables, cyclical encoding of time variables, lasso regression on image features, and joining all variables. Its output is required to run cnn-xgboost.py and XGBOOST.ipynb
- cnn-xgboost.py : hybrid model and CNN model
- XGBOOST.ipynb : XGBoost model
- ML_Plotting.ipynb : scripts for figures and model comparisons
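
Two of the preprocessing steps listed for file-prep.py, one-hot encoding and cyclical encoding, can be illustrated with a minimal sketch. This is not the code in file-prep.py: the function names cyclical_encode and one_hot and the sample values are hypothetical, and plain Python is used only to keep the example dependency-free (the actual script may rely on pandas or scikit-learn).

```python
import math


def cyclical_encode(value, period):
    """Map a periodic value (e.g. hour 0-23 with period 24) to (sin, cos).

    Values at the two ends of the cycle (hour 23 and hour 0) end up
    close together, which a raw integer encoding does not capture.
    """
    angle = 2.0 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


def one_hot(values):
    """Return (categories, rows); each row is a 0/1 indicator list.

    Categories are sorted so the column order is deterministic.
    """
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        rows.append(row)
    return categories, rows


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    for hour in (0, 6, 12, 23):
        print(hour, cyclical_encode(hour, 24))

    categories, rows = one_hot(["b", "a", "b"])
    print(categories)
    for row in rows:
        print(row)
```

The sine and cosine columns are used together because either one alone maps two different times to the same value.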