This project is a spam classification system that uses the Multinomial Naive Bayes algorithm. It pre-processes a dataset of emails with Natural Language Processing techniques such as tokenization, stop-word removal, and stemming, and then uses the pre-processed data to train the classifier. The model follows the bag-of-words approach, in which each email is represented as a vector of word frequencies. The project is written in Python and uses pandas, scikit-learn, and matplotlib for data processing, modeling, and visualization, respectively. Model performance is evaluated with precision, recall, and F1-score. The project also visualizes the most frequently used words in spam and ham emails.
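The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the toy `EMAILS` corpus, the `crude_stem` helper (a simple suffix stripper standing in for a real stemmer such as Porter), and the `preprocess` function are all assumptions made for the example. It uses scikit-learn's built-in English stop-word list, `CountVectorizer` for the bag-of-words representation, and `MultinomialNB` for the classifier.

```python
import re

from sklearn.feature_extraction.text import CountVectorizer, ENGLISH_STOP_WORDS
from sklearn.metrics import precision_recall_fscore_support
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy corpus; the project's real dataset is not shown here.
EMAILS = [
    ("Win a free prize now, click here", "spam"),
    ("Free cash offer, claim your prize today", "spam"),
    ("Cheap loans approved, click to claim free money", "spam"),
    ("Meeting moved to Monday at noon", "ham"),
    ("Please review the attached project report", "ham"),
    ("Lunch on Friday with the project team?", "ham"),
]


def crude_stem(token):
    """Rough suffix stripping; an assumed stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, and stem; returns a list of tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in ENGLISH_STOP_WORDS]


texts = [text for text, _ in EMAILS]
labels = [label for _, label in EMAILS]

# Bag-of-words: each email becomes a vector of token counts.
vectorizer = CountVectorizer(analyzer=preprocess)
X = vectorizer.fit_transform(texts)

model = MultinomialNB()
model.fit(X, labels)

# Scored on the training data only to keep the sketch short;
# a real evaluation would use a held-out test split.
predictions = model.predict(X)
precision, recall, f1, _ = precision_recall_fscore_support(
    labels, predictions, pos_label="spam", average="binary"
)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")

# Classify a new, unseen email.
predicted = model.predict(vectorizer.transform(["Claim your free prize"]))[0]
print(predicted)
```

Passing `preprocess` as the `analyzer` keeps tokenization, stop-word removal, and stemming in one place, so the same steps are applied at training and prediction time.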
An email spam classification system based on Multinomial Naive Bayes. It uses NLP techniques to pre-process a dataset of spam/ham emails and trains a model to predict whether new emails are spam or ham. Model performance is evaluated with precision, recall, and F1-score.
AvichalS/Email-Spam-Filtering