IS843 Project on Google Cloud Platform with Spark
Data: Austin Animal Center Shelter Outcomes
Data sources:
- https://www.kaggle.com/aaronschlegel/austin-animal-center-shelter-outcomes-and#aac_shelter_outcomes.csv
- https://www.petfinder.com/dog-breeds/
- https://www.petfinder.com/cat-breeds/
What is in this project:
- Binary prediction for dogs and cats: adopted or not adopted
- Data cleaning and processing
- Exploratory data analysis
- Feature engineering
- Logistic Regression, Multilayer Perceptron Classifier, Support Vector Machine
- Accuracy, precision, recall, F1-score, ROC curve and AUC
- t-test for coefficient significance
- Pipeline and cross-validation for hyperparameter tuning
- Model performance evaluation and comparison
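
The evaluation metrics listed above follow directly from confusion-matrix counts. Below is a minimal sketch in plain Python, assuming binary labels where 1 means "adopted" and 0 means "not adopted". The function name `classification_metrics` and the sample labels are hypothetical illustrations, not taken from the project's code or data.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels.

    Assumption: label 1 = adopted (positive class), label 0 = not adopted.
    """
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard every ratio against division by zero
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels for illustration only (not project data)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

This sketch only illustrates the definitions. In a Spark workflow, the `pyspark.ml.evaluation` module provides evaluators that compute such metrics on DataFrames.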