Spam-Email-Classification

Project Overview

The objective of this project was to develop a machine learning model that classifies emails as either spam or non-spam (ham). Email spam classification is a common problem in natural language processing (NLP) and has significant applications in email filtering systems.

Problem Statement

The goal of the project was to build a classifier that can accurately differentiate between spam and non-spam emails. This involved preprocessing the email text data, extracting relevant features, training a classification model, and evaluating its performance.

Dataset

The project used a publicly available dataset of labeled emails, where each email is classified as spam or ham. The dataset contains both the email text and the corresponding label.
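As a rough illustration of what loading such a dataset could look like with pandas, here is a minimal sketch. The column names `label` and `text`, the inline sample rows, and the helper `load_emails` are all hypothetical stand-ins; the actual dataset file and its schema are not specified here.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real dataset file. The column names
# "label" and "text" are assumptions, not taken from the project.
SAMPLE_CSV = """label,text
ham,Are we still meeting for lunch tomorrow?
spam,WINNER! Claim your free prize now
ham,Please find the report attached
spam,Cheap loans approved instantly click here
"""


def load_emails(source):
    """Read labeled emails and add a numeric target column (1 = spam, 0 = ham)."""
    df = pd.read_csv(source)
    # Basic cleaning: drop rows with missing fields and exact duplicates.
    df = df.dropna(subset=["label", "text"]).drop_duplicates()
    df["is_spam"] = (df["label"].str.strip().str.lower() == "spam").astype(int)
    return df


emails = load_emails(io.StringIO(SAMPLE_CSV))
print(emails["is_spam"].value_counts())
```

With a real file, a path would be passed to `load_emails` in place of the in-memory buffer.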

Approach

The approach involved the following steps:

  1. Imported the libraries needed for data processing and model building.
  2. Prepared the data by loading the dataset, cleaning it, and preprocessing the text.
  3. Extracted features to convert the text data into numerical form.
  4. Trained the model using a classification algorithm.
  5. Evaluated the model's performance using appropriate metrics.
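The feature extraction and training steps above can be sketched with a scikit-learn pipeline. This is only an illustration under stated assumptions: TF-IDF features and a multinomial Naive Bayes classifier are assumed choices, since the specific feature extractor and algorithm are not named here, and the toy messages are invented for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy messages invented for illustration; 1 = spam, 0 = ham.
train_texts = [
    "win free prize now",
    "free cash click now",
    "claim free prize click",
    "cheap loans win cash",
    "meeting tomorrow at noon",
    "please review the report",
    "lunch tomorrow with team",
    "report attached for review",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Assumed choices: TF-IDF turns each email into a numeric vector,
# and Naive Bayes learns which terms are typical of each class.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

new_texts = ["free prize now", "meeting tomorrow report"]
predictions = model.predict(new_texts)
print(list(zip(new_texts, predictions)))
```

Wrapping the vectorizer and classifier in one pipeline means the vocabulary is learned only from the training data, so the same transformation is applied consistently to unseen emails.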

Results

The model achieved an accuracy of 99.19% on the test dataset, indicating that it classifies the large majority of emails correctly.
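For reference, accuracy is the fraction of test emails whose predicted label matches the true label. The sketch below shows how accuracy and a few complementary metrics can be computed with `sklearn.metrics`. The label arrays are made up for illustration and are not the project's actual predictions.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_score,
    recall_score,
)

# Made-up labels for illustration only; 1 = spam, 0 = ham.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0]

accuracy = accuracy_score(y_true, y_pred)    # share of correct predictions
precision = precision_score(y_true, y_pred)  # of emails flagged spam, share truly spam
recall = recall_score(y_true, y_pred)        # of true spam, share that was caught

# Rows are true classes (ham, spam); columns are predicted classes.
matrix = confusion_matrix(y_true, y_pred)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
print(matrix)
```

Precision and recall are worth reporting alongside accuracy because a spam filter's two kinds of error (flagging a legitimate email, or missing a spam email) have different costs.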

Technologies Used

Python, pandas, scikit-learn, Jupyter Notebook.

Skills Demonstrated

Data preprocessing, feature extraction, classification modeling, model evaluation.