KasrAskari/Medical-Insurance

Medical Insurance Cost Prediction

Overview

This project predicts individual medical insurance costs from demographic and health-related features. It trains machine learning models on those features and examines how factors such as age, BMI, and smoking status relate to insurance charges.


Features

  • Data Preprocessing: Handling missing values, encoding categorical data, and feature scaling.
  • Exploratory Data Analysis (EDA): Visualizing trends and correlations between features and insurance costs.
  • Model Training: Implementing machine learning algorithms to predict insurance charges.
  • Performance Metrics: Evaluating model predictions with error metrics such as Mean Absolute Error (MAE), the average absolute difference between predicted and actual charges.
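
To make the MAE metric concrete, here is a minimal sketch of the arithmetic in plain Python. Scikit-learn provides its own `mean_absolute_error` in `sklearn.metrics`; the function below is only an illustration, not this repository's code, and the charge values are made-up numbers chosen for easy arithmetic.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |actual - predicted| over all samples."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical charges (not taken from the dataset).
actual = [1000.0, 2500.0, 4000.0]
predicted = [1200.0, 2400.0, 3700.0]

# Absolute errors are 200, 100, and 300, so the mean is 200.
mae = mean_absolute_error(actual, predicted)
print(mae)
```

MAE is expressed in the same unit as the target, so a value of 200 here would mean predictions are off by 200 currency units on average.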

Project Structure

Medical-Insurance/
├── data/                  # Dataset for training and testing
├── notebooks/             # Jupyter notebooks for EDA and model development
├── scripts/               # Python scripts for preprocessing and model training
├── visualizations/        # Charts and graphs for insights
├── models/                # Trained machine learning models
├── README.md              # Project documentation
└── LICENSE                # License information

Technologies Used

  • Python: Core programming language.
  • Pandas: For data manipulation and preprocessing.
  • Matplotlib/Seaborn: Visualizing relationships between features.
  • Scikit-learn: Building and evaluating machine learning models.
  • NumPy: Efficient numerical computations.

Dataset

The dataset includes the following features:

  • Age: Age of the individual.
  • Sex: Gender (male/female).
  • BMI: Body mass index.
  • Children: Number of dependents.
  • Smoker: Whether the individual is a smoker.
  • Region: Geographical region.
  • Charges: Medical insurance costs (target variable).

The dataset is available on Kaggle as the Medical Cost Personal Dataset (see Resources).
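
As a rough illustration of how the categorical columns above might be encoded before training, the sketch below uses only the Python standard library. The lowercase column names, the four region labels, and the two sample rows are assumptions made for this example; they are not read from the repository's data files, and the project itself may encode features differently (for instance with Pandas or Scikit-learn).

```python
import csv
import io

# Made-up rows in the assumed column layout (not real dataset records).
SAMPLE_CSV = """age,sex,bmi,children,smoker,region,charges
30,female,25.0,0,yes,southwest,15000.0
45,male,31.5,2,no,northeast,8000.0
"""

# Assumed region labels, used for one-hot encoding.
REGIONS = ["northeast", "northwest", "southeast", "southwest"]


def encode_row(row):
    """Turn one CSV record into (numeric feature list, target charge)."""
    features = [
        float(row["age"]),
        1.0 if row["sex"] == "male" else 0.0,    # binary encoding
        float(row["bmi"]),
        float(row["children"]),
        1.0 if row["smoker"] == "yes" else 0.0,  # binary encoding
    ]
    # One-hot encode the region: exactly one of the four slots is 1.0.
    features += [1.0 if row["region"] == r else 0.0 for r in REGIONS]
    return features, float(row["charges"])


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
encoded = [encode_row(r) for r in rows]

for features, charges in encoded:
    print(features, charges)
```

Each record becomes nine numeric features plus the target, a shape most regression models can consume directly.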


Resources

For further reading and reference:

  1. Medical Cost Personal Dataset - Kaggle
  2. Scikit-learn Documentation
  3. Exploratory Data Analysis Guide