This project prepares data for predicting complications of myocardial infarction. The notebook covers data exploration, visualization, cleaning, and transformation, the steps that are essential for improving model performance.
Explore and Visualize the Dataset
- Identified patterns, trends, and anomalies
- Visualized key features of the data
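As a rough illustration of the exploration step, the sketch below builds a per-column overview (type, share of missing values, number of distinct values) and checks the balance of the target. It assumes pandas; the toy table, its column names, and the `summarize` helper are hypothetical stand-ins, not the notebook's actual code or the real dataset's schema.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real table; column names and values are made up.
df = pd.DataFrame({
    "age": [63, 71, np.nan, 55, 80, 49],
    "systolic_bp": [140, np.nan, 120, 300, 135, 128],
    "sex": ["M", "F", "F", "M", None, "M"],
    "complication": [0, 1, 0, 0, 1, 0],
})

def summarize(frame: pd.DataFrame) -> pd.DataFrame:
    """Per-column overview: dtype, share of missing values, distinct values."""
    return pd.DataFrame({
        "dtype": frame.dtypes.astype(str),
        "missing_frac": frame.isna().mean(),
        "n_unique": frame.nunique(dropna=True),
    })

summary = summarize(df)

# Share of each class in the (hypothetical) target column.
target_balance = df["complication"].value_counts(normalize=True)

print(summary)
print(target_balance)
```

In a notebook, `df.hist()` or `df.boxplot()` (both need matplotlib) would give a visual counterpart to this table.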
Data Cleaning and Transformation
- Handled missing values
- Detected and treated outliers
- Transformed categorical and numerical features
- Performed feature engineering as necessary
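The cleaning steps listed above could look roughly like the following. This is a minimal sketch under stated assumptions: median imputation, clipping to 1.5 times the interquartile range, and one-hot encoding are common choices picked here for illustration, and the README does not say which techniques the notebook actually uses. The toy data and the `clean` helper are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy data with one missing age, one extreme blood pressure, one missing category.
df = pd.DataFrame({
    "age": [63.0, 71.0, np.nan, 55.0, 80.0, 49.0],
    "systolic_bp": [140.0, 130.0, 120.0, 300.0, 135.0, 128.0],
    "sex": ["M", "F", "F", "M", None, "M"],
})

def clean(frame, numeric_cols, categorical_cols):
    """Impute, clip outliers, and one-hot encode (illustrative choices only)."""
    out = frame.copy()
    for col in numeric_cols:
        # Missing values: fill with the column median.
        out[col] = out[col].fillna(out[col].median())
        # Outliers: clip to [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
        q1, q3 = out[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        out[col] = out[col].clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr)
    for col in categorical_cols:
        # Missing categories: fill with the most frequent value.
        out[col] = out[col].fillna(out[col].mode().iloc[0])
    # Categorical features: one 0/1 indicator column per category.
    return pd.get_dummies(out, columns=categorical_cols, dtype=int)

cleaned = clean(df, ["age", "systolic_bp"], ["sex"])
print(cleaned)
```

Clipping keeps every row, which matters for small clinical tables; dropping outlier rows instead is an equally valid choice when the extreme values are known to be recording errors.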
Standard Data Preparation Tasks
- Data Cleaning: Identifying and correcting mistakes or errors in the data.
- Feature Selection: Identifying input variables that are most relevant to the task.
- Data Transforms: Changing the scale or distribution of variables.
- Feature Engineering: Deriving new variables from available data.
- Dimensionality Reduction: Creating compact projections of the data.
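Three of the tasks above (feature selection, data transforms, and dimensionality reduction) can be sketched with NumPy alone. The example assumes a very simple selection rule (drop constant columns), z-score standardization, and PCA computed via singular value decomposition; the synthetic data and the helper names `drop_constant`, `standardize`, and `pca` are hypothetical. scikit-learn offers ready-made equivalents (`VarianceThreshold`, `StandardScaler`, `PCA`).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
base = rng.normal(size=(n, 2))

# Synthetic features: column 1 is nearly redundant with column 0,
# and column 3 is constant (carries no information).
X = np.column_stack([
    base[:, 0],
    2.0 * base[:, 0] + 0.1 * rng.normal(size=n),
    base[:, 1],
    np.full(n, 5.0),
])

def drop_constant(X, tol=1e-12):
    """Feature selection: keep only columns whose variance exceeds tol."""
    keep = X.var(axis=0) > tol
    return X[:, keep], keep

def standardize(X):
    """Data transform: rescale each column to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def pca(X, n_components):
    """Dimensionality reduction: project onto the top principal components."""
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    return Xc @ Vt[:n_components].T, explained[:n_components]

X_sel, kept = drop_constant(X)
X_std = standardize(X_sel)
Z, ratio = pca(X_std, n_components=2)

print(kept)         # which original columns survived selection
print(Z.shape)      # shape of the compact projection
print(ratio.sum())  # share of variance retained by two components
```

Standardizing before PCA keeps features measured on large scales from dominating the projection.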
The dataset ("Myocardial infarction complications") is available from the UCI Machine Learning Repository.
- A Jupyter notebook documenting:
  - Business Understanding
  - Data Understanding
  - Data Exploration
  - Data Preparation