Life Expectancy Prediction Project

This repository contains the code and resources for the Life Expectancy Prediction Project, which aims to predict life expectancy based on various health, economic, and social factors. This project is part of a data science portfolio and demonstrates end-to-end model development, including data preprocessing, exploratory data analysis (EDA), model training, evaluation, and interpretation.

Project Overview

Life expectancy is a critical measure of a country's health and development. This project uses a dataset of health, economic, and social indicators (listed under Dataset) to predict life expectancy with regression models. The main steps of the project include:

Data Preprocessing
Exploratory Data Analysis (EDA)
Model Selection and Training
Hyperparameter Tuning
Model Interpretation
Model Saving

Dataset

The dataset used in this project is 'Life Expectancy Data.csv', which contains the following columns:

Country
Year
Status
Life expectancy
Adult Mortality
Infant deaths
Alcohol
Percentage expenditure
Hepatitis B
Measles
BMI
Under-five deaths
Polio
Total expenditure
Diphtheria
HIV/AIDS
GDP
Population
Thinness 1-19 years
Thinness 5-9 years
Income composition of resources
Schooling

Project Structure

data/: Contains the dataset file 'Life Expectancy Data.csv'.
notebooks/: Jupyter notebooks for data analysis and model development.
models/: Directory to save trained models.
README.md: Project overview.

Key Components

Data Preprocessing

Code to handle missing values, encode categorical variables, and scale numerical features.
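
As an illustration of these three steps, here is a minimal sketch using scikit-learn. It is not the notebook's actual code: the tiny stand-in frame, the choice of median imputation, and the variable names (`numeric_cols`, `categorical_cols`, `preprocessor`) are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Tiny stand-in frame with made-up values; the real project would load
# 'Life Expectancy Data.csv' instead.
df = pd.DataFrame({
    "Status": ["Developing", "Developed", "Developing", "Developed"],
    "Adult Mortality": [263.0, np.nan, 271.0, 74.0],
    "GDP": [584.3, 42000.0, np.nan, 51000.0],
})

numeric_cols = ["Adult Mortality", "GDP"]   # assumed subset of numeric columns
categorical_cols = ["Status"]

# Numeric columns: fill gaps with the median (an assumed strategy), then standardize.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: one-hot encode. sparse_threshold=0 forces a dense array.
preprocessor = ColumnTransformer(
    [
        ("num", numeric_steps, numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    sparse_threshold=0,
)

X = preprocessor.fit_transform(df)
print(X.shape)
```

Wrapping the steps in a `ColumnTransformer` keeps the imputation and scaling statistics tied to the training data, so the same fitted object can later be applied to new rows.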

Exploratory Data Analysis (EDA)

Code to visualize the distribution of variables, correlation matrix, pair plots, and other insights.
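
The numbers behind such plots can be computed with pandas alone. The sketch below is illustrative rather than the notebook's code: it uses synthetic values with an invented relationship between the columns, not the real dataset, and leaves the plotting itself to whichever library the notebook uses.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data; the relationship between columns is invented
# purely so the example has something to show.
rng = np.random.default_rng(0)
n = 200
schooling = rng.normal(12, 3, n)
adult_mortality = rng.normal(160, 40, n)
life_expectancy = 50 + 1.5 * schooling - 0.05 * adult_mortality + rng.normal(0, 2, n)

df = pd.DataFrame({
    "Schooling": schooling,
    "Adult Mortality": adult_mortality,
    "Life expectancy": life_expectancy,
})

summary = df.describe()      # per-column distribution statistics
missing = df.isna().sum()    # missing-value count per column
corr = df.corr()             # Pearson correlation matrix

# Rank the features by the strength of their correlation with the target.
target_corr = corr["Life expectancy"].drop("Life expectancy")
target_corr = target_corr.reindex(target_corr.abs().sort_values(ascending=False).index)

print(summary.loc[["mean", "std"]])
print(target_corr)
```

The `corr` frame is what a correlation heatmap would display, and `target_corr` gives a quick ranking of candidate predictors before any modelling.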

Model Selection and Training

Code to train and evaluate multiple models (Linear Regression, Random Forest, Gradient Boosting) and select the best model based on performance metrics.
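
A comparison of this kind might look like the following sketch. It assumes a simple train/test split with R² and RMSE as the performance metrics, and it uses synthetic regression data from `make_regression` in place of the preprocessed dataset; the README does not state which metrics or split the notebook uses.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the preprocessed features and target.
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(n_estimators=50, random_state=42),
    "Gradient Boosting": GradientBoostingRegressor(random_state=42),
}

# Fit each model and score it on the held-out split.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "r2": r2_score(y_test, pred),
        "rmse": mean_squared_error(y_test, pred) ** 0.5,
    }

# Pick the model with the highest R² (an assumed selection rule).
best_name = max(results, key=lambda k: results[k]["r2"])
for name, scores in results.items():
    print(f"{name}: R2={scores['r2']:.3f}, RMSE={scores['rmse']:.2f}")
print("Best:", best_name)
```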

Hyperparameter Tuning

Code to perform hyperparameter tuning for the best model using GridSearchCV.
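
The sketch below shows one way such a search could be set up. The README does not say which model was selected or which parameters were searched, so the random forest, the small `param_grid`, the 3-fold cross-validation, and the R² scoring are all assumptions; the data is synthetic.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the training split.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Hypothetical grid; a real search would cover the parameters that matter
# for whichever model won the comparison.
param_grid = {
    "n_estimators": [25, 50],
    "max_depth": [3, None],
}

search = GridSearchCV(
    estimator=RandomForestRegressor(random_state=0),
    param_grid=param_grid,
    cv=3,             # 3-fold cross-validation for every parameter combination
    scoring="r2",
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best CV R2:", round(search.best_score_, 3))

# With the default refit=True, best_estimator_ is retrained on all of X, y.
tuned_model = search.best_estimator_
```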

Model Interpretation

Code to interpret the model using feature importance and permutation importance.
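
Both kinds of importance are available in scikit-learn, as the sketch below illustrates. It assumes a tree-based model (which exposes `feature_importances_`) and uses synthetic data with generic feature names, so the printed rankings say nothing about the real dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data: feature_a drives the target, feature_b matters a little,
# feature_c is pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = 5.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=400)
feature_names = ["feature_a", "feature_b", "feature_c"]  # hypothetical names

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

# Impurity-based importance, computed from the training process itself.
impurity_importance = dict(zip(feature_names, model.feature_importances_))

# Permutation importance: the drop in test score when one column is shuffled.
perm = permutation_importance(model, X_test, y_test, n_repeats=5, random_state=0)
permutation_mean = dict(zip(feature_names, perm.importances_mean))

for name in feature_names:
    print(f"{name}: impurity={impurity_importance[name]:.3f}, "
          f"permutation={permutation_mean[name]:.3f}")
```

Permutation importance is measured on held-out data, so it is a useful cross-check on the impurity-based values, which are derived from the training set only.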

Model Saving

Code to save the trained model using joblib for future use.
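
A minimal save-and-reload round trip with joblib could look like this. The file name `life_expectancy_model.joblib` and the toy linear model are placeholders for the example; the sketch writes to a temporary directory rather than to the project's own model location.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy model standing in for the tuned estimator.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
model = LinearRegression().fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "life_expectancy_model.joblib"  # hypothetical file name
    joblib.dump(model, model_path)      # serialize the fitted estimator to disk
    restored = joblib.load(model_path)  # load it back, as a later session would

# The reloaded model should reproduce the original predictions.
same_predictions = np.allclose(model.predict(X), restored.predict(X))
print("Round trip OK:", same_predictions)
```

A saved model should be loaded with the same scikit-learn version that produced it, and only from trusted sources, since joblib files are pickle-based.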

Usage

To run the project, follow these steps:

Preprocess the data.
Perform EDA to understand the data better.
Train multiple models and evaluate their performance.
Tune hyperparameters for the best model.
Interpret the model to understand feature importance.
Save the final model for future predictions.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

For any questions or suggestions, feel free to open an issue or contact me at subhro2002@gmail.com.
