This project, built in a Jupyter Notebook on the Kaggle Titanic dataset, performs a detailed data analysis and predicts which passengers survived the sinking of the Titanic.
pclass: A proxy for socio-economic status (SES)
1st = Upper
2nd = Middle
3rd = Lower
age: Age is fractional if less than 1. If the age is estimated, it is in the form of xx.5.
sibsp: The dataset defines family relations in this way...
Sibling = brother, sister, stepbrother, stepsister
Spouse = husband, wife (mistresses and fiancés were ignored)
parch: The dataset defines family relations in this way...
Parent = mother, father
Child = daughter, son, stepdaughter, stepson
Some children travelled only with a nanny, so parch=0 for them.
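
The column notes above can be turned into a few derived features. The following is only a minimal sketch, not the notebook's actual code: the rows are made up for illustration, the derived column names (`ses`, `age_estimated`, `relatives`) are hypothetical, and the lowercase column names follow the notes above (the Kaggle CSV may capitalize them differently).

```python
import pandas as pd

# Hypothetical rows for illustration only; not taken from the real dataset.
df = pd.DataFrame({
    "pclass": [1, 3, 2, 3],
    "age": [38.0, 0.75, 28.5, 4.0],
    "sibsp": [1, 2, 0, 0],
    "parch": [0, 1, 0, 0],
})

# Map the ticket class to the socio-economic label it stands in for.
df["ses"] = df["pclass"].map({1: "Upper", 2: "Middle", 3: "Lower"})

# Ages of the form xx.5 are estimates; ages below 1 are fractional by
# definition, so they are excluded from the check.
df["age_estimated"] = (df["age"] >= 1) & (df["age"] % 1 == 0.5)

# Relatives aboard: siblings/spouses plus parents/children.
df["relatives"] = df["sibsp"] + df["parch"]

print(df)
```

The last row mirrors the nanny case: a 4-year-old with `sibsp=0` and `parch=0` ends up with zero relatives aboard even though the child was not travelling alone.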
- Cleaning Data
- Statistical Inference
- Exploratory Data Analysis (EDA)
- Data Visualization
- Supervised Machine Learning Algorithms: Logistic Regression, Random Forest, Naive Bayes, K-nearest Neighbors, SVC
- Python 3.8.8
- Pandas 1.2.4
- Matplotlib 3.3.4
- Seaborn 0.11.1
- scikit-learn 0.24.1
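
As a rough illustration of how the five classifiers named in the list above could be compared with scikit-learn, here is a sketch under stated assumptions. It does not reproduce the notebook: the data is a synthetic stand-in generated with `make_classification` rather than the cleaned Titanic features, `GaussianNB` is assumed as the Naive Bayes variant, and all hyperparameters are placeholder choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for the cleaned features (X) and survival labels (y).
X, y = make_classification(n_samples=200, n_features=6, random_state=0)

# Assumed model settings; the notebook may use different ones.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Naive Bayes": GaussianNB(),
    "K-nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
    "SVC": SVC(),
}

# Mean 5-fold cross-validated accuracy for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}

# Print the models from highest to lowest mean accuracy.
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Cross-validation is used here instead of a single train/test split so that each model is scored on every sample once, which tends to give a steadier comparison on a dataset as small as this one.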