Linear Regression on Medical Insurance Dataset

📍Introduction

This notebook examines the cost of medical insurance using a simple linear regression model. The data consists of 1,338 entries.

About Dataset 📑

This dataset is used in the book Machine Learning with R by Brett Lantz, which provides an introduction to machine learning using R. All of the book's datasets are in the public domain but needed some cleaning up and recoding to match the format in the book. The data used here, obtained from Kaggle, describes the medical insurance costs of a small sample of the US population, based on the attributes listed under "Columns".

Columns

age: age of primary beneficiary

sex: gender of the insurance contractor (female or male)

bmi: body mass index, an objective index of body weight relative to height (kg / m^2), computed as weight divided by height squared; it indicates whether weight is relatively high or low for a given height, ideally 18.5 to 24.9

children: number of children covered by health insurance (number of dependents)

smoker: whether the beneficiary smokes

region: the beneficiary's residential area in the US (northeast, southeast, southwest, or northwest)

charges: Individual medical costs billed by health insurance
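
Since a linear regression needs numeric inputs, the text columns above have to be encoded before fitting. The following is only a minimal sketch of one possible encoding, not necessarily what the notebook does. The helper names `bmi` and `encode_row` are hypothetical, the "yes"/"no" coding of smoker is an assumption, and the example record is made up rather than taken from the dataset.

```python
# Hypothetical helpers for illustration; not taken from the notebook.

REGIONS = ["northeast", "northwest", "southeast", "southwest"]


def bmi(weight_kg, height_m):
    """Body mass index: weight in kg divided by height in metres squared."""
    return weight_kg / height_m ** 2


def encode_row(row):
    """Encode one record (a dict of raw column values) as a list of floats."""
    features = [
        float(row["age"]),
        1.0 if row["sex"] == "male" else 0.0,
        float(row["bmi"]),
        float(row["children"]),
        1.0 if row["smoker"] == "yes" else 0.0,  # assumes "yes"/"no" coding
    ]
    # One-hot encode region, dropping the first level as the baseline
    # so the indicator columns are not collinear with the intercept.
    features += [1.0 if row["region"] == r else 0.0 for r in REGIONS[1:]]
    return features


# Made-up example record (not a row from the dataset)
example = {
    "age": "40",
    "sex": "female",
    "bmi": "25.0",
    "children": "2",
    "smoker": "no",
    "region": "southeast",
}
print(encode_row(example))
```

The charges column is left out on purpose, because it is the target the model predicts rather than an input feature.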

Problem Statement 📝

Can you accurately predict insurance costs?

Let's see how we can implement it... 😄
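
As a rough idea of what "predicting accurately" can mean here, the sketch below fits an ordinary least squares model from scratch and scores it with R^2. It is an illustration under stated assumptions: the notebook most likely relies on a library instead, the function names are hypothetical, and the five training rows are synthetic values generated from a known formula, not insurance data.

```python
# From-scratch ordinary least squares; all names here are hypothetical.

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_linear_regression(X, y):
    """Return [intercept, coef_1, ...] from the normal equations."""
    Xb = [[1.0] + list(row) for row in X]  # prepend an intercept column
    k = len(Xb[0])
    XtX = [[sum(r[i] * r[j] for r in Xb) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * t for r, t in zip(Xb, y)) for i in range(k)]
    return solve(XtX, Xty)


def predict(coefs, row):
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


def r_squared(y_true, y_pred):
    """Share of the variance in y_true explained by the predictions."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic rows: (age, smoker flag); charges follow a made-up exact formula.
X = [(20, 0), (30, 1), (40, 0), (50, 1), (60, 0)]
y = [1000 + 200 * age + 20000 * smoker for age, smoker in X]

coefs = fit_linear_regression(X, y)
preds = [predict(coefs, row) for row in X]
print("coefficients:", coefs)
print("R^2:", r_squared(y, preds))
```

Because the synthetic charges are an exact linear function of the inputs, the fit recovers the formula almost perfectly. Real insurance data is noisy, so a more honest check would score the model on rows held out from training.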