An email spam classification system based on Multinomial Naive Bayes. It uses NLP techniques to pre-process a dataset of spam/ham emails and trains a logistic regression model to predict whether new emails are spam or ham. Model accuracy is evaluated with precision, recall, and F1-score.

AvichalS/Email-Spam-Filtering

Email-Spam-Filtering

This project is a spam classification system that uses the Multinomial Naive Bayes algorithm. A dataset of spam and ham emails is pre-processed with Natural Language Processing techniques: tokenization, stop-word removal, and stemming. The pre-processed data is then used to train a logistic regression model. The model follows a bag-of-words approach, in which each email is represented as a vector of word frequencies. The project is written in Python and uses pandas for data processing, scikit-learn for modeling, and matplotlib for visualization. Model accuracy is evaluated with precision, recall, and F1-score. The project also visualizes the most frequently used words in spam and ham emails.
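The steps described above can be sketched in plain Python to show the mechanics. This is a minimal illustration, not the project's code: the project itself relies on scikit-learn (where `CountVectorizer` and `MultinomialNB` would play these roles), and the text also mentions logistic regression, which is not shown here. The stop-word list, the `crude_stem` helper, the class and function names, and the four training emails are all assumptions made up for the example.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the project's list is not given).
STOP_WORDS = {"a", "an", "the", "to", "is", "are", "you", "your",
              "for", "at", "of", "and", "in", "on", "now"}


def crude_stem(word):
    """Toy suffix stripper standing in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Tokenize, drop stop words, and stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


class MultinomialNaiveBayes:
    """Bag-of-words Multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        # Per-class word frequencies: the bag-of-words counts.
        self.word_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(preprocess(doc))
        self.vocab = set().union(*self.word_counts.values())
        docs_per_class = Counter(labels)
        self.log_prior = {c: math.log(docs_per_class[c] / len(labels))
                          for c in self.classes}
        self.total = {c: sum(self.word_counts[c].values()) for c in self.classes}
        return self

    def _log_likelihood(self, word, c):
        numerator = self.word_counts[c][word] + self.alpha
        denominator = self.total[c] + self.alpha * len(self.vocab)
        return math.log(numerator / denominator)

    def predict(self, doc):
        # Words never seen in training are ignored.
        tokens = [t for t in preprocess(doc) if t in self.vocab]
        scores = {c: self.log_prior[c]
                  + sum(self._log_likelihood(t, c) for t in tokens)
                  for c in self.classes}
        return max(scores, key=scores.get)


def precision_recall_f1(y_true, y_pred, positive="spam"):
    """Precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Made-up toy data, for illustration only.
train_docs = [
    "Win a free prize now",
    "Claim your free cash reward",
    "Meeting scheduled for Monday morning",
    "Please review the attached project report",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = MultinomialNaiveBayes().fit(train_docs, train_labels)
print(model.predict("free prize waiting"))
print(model.word_counts["spam"].most_common(3))
```

The per-class counters double as the input for the word-frequency visualization: `model.word_counts["spam"].most_common(n)` returns the `n` most frequent spam tokens and their counts, which could be passed to a matplotlib bar chart.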
