Portfolio of ongoing and completed data science projects that I did as part of my academic work or self-directed learning, presented as IPython notebooks
-
Exploratory data analysis and visual exploration
-
- Gender Gap: A visual analysis of the gender gap across academic degrees over the years.
- Police activity in Florida: An exploratory data analysis of a dataset collected from Stanford open data. The notebook shows how police activity changes with different factors.

Machine Learning
-
- Box office revenue prediction: Feature engineering was used to derive several new features from text data. Feature selection methods were then applied to over 3000 variables before fitting a regression model on the transformed data.
- Prediction of loan default: Exploratory data analysis on a loan applications dataset. After extensive cleaning, gradient boosting classifiers were trained on a highly imbalanced dataset to predict whether a loan will default. Note: the notebook is yet to be updated.
- Daily bike rental prediction: A machine learning model that predicts the number of bikes rented per day.
- Analysis of residential complaints: Extensive data exploration and visual analysis of a dataset released by NYC Open Data to answer a set of questions. Classifiers were then applied to predict the type of complaint filed by a resident.
- Sales prediction: A simple linear regression model to predict store sales for Big Mart.
- Heart disease: A logistic regression project that predicts the class of heart disease from a sample of patient data.
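
The loan default entry above mentions a highly imbalanced dataset. As a rough illustration only (this is not code from the notebook), the sketch below uses made-up labels to show why accuracy alone can be misleading in that setting, and how "balanced" class weights are commonly computed. The function names and the 95/5 class split are assumptions chosen for the example.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that were predicted as positive."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    actual_pos = sum(1 for t in y_true if t == positive)
    return true_pos / actual_pos if actual_pos else 0.0


# Toy labels, made up for illustration: 95 repaid loans (0) and 5 defaults (1).
y_true = [0] * 95 + [1] * 5

# A trivial baseline that always predicts "repaid".
always_repaid = [0] * len(y_true)

weights = balanced_class_weights(y_true)
baseline_accuracy = accuracy(y_true, always_repaid)
baseline_recall = recall(y_true, always_repaid)

print("class weights:", weights)
print("baseline accuracy:", baseline_accuracy)
print("baseline recall on defaults:", baseline_recall)
```

The baseline scores high on accuracy while finding no defaults at all, which is why recall or precision on the minority class is usually worth reporting alongside accuracy. Weights like these can be expanded to one value per row and passed as `sample_weight` when fitting a gradient boosting classifier in scikit-learn; whether the notebook does this is not stated here.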