Spam Email Detection using Naive Bayes

Models used:
- MultinomialNB
- BernoulliNB

Steps:
- First, we removed punctuation (periods, commas, apostrophes, quotation marks, question marks, exclamation marks, brackets, braces, parentheses, dashes, hyphens, ellipses, colons, semicolons, etc.) and stopwords (a, the, is, are, and, etc.).
- Second, we applied lemmatization, a technique that reduces each word to its dictionary form (lemma). For example:
  - "running" becomes "run"
  - "better" becomes "good"
  - "wolves" becomes "wolf"
- Then, we applied a TF-IDF vectorizer:
  - TF (Term Frequency): measures how often a word appears in a specific document.
  - IDF (Inverse Document Frequency): measures how rare a word is across the whole collection of documents; words that appear in few documents get a higher weight.
  - TF-IDF score: the product of TF and IDF, so a word scores highly when it is frequent in a document but uncommon across the corpus.
- Finally, we fit the models and evaluated their performance.
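
The preprocessing and TF-IDF steps above can be sketched from scratch as follows. This is a minimal illustration using only the Python standard library, not the notebook's actual code. The `STOPWORDS` set, the helper names `preprocess` and `tf_idf`, and the three sample messages are all hypothetical, and lemmatization is left out because it needs an external library. The IDF here is the plain textbook form `ln(N / df)`; library vectorizers typically apply smoothing and normalization, so their numbers will differ.

```python
import math
import string
from collections import Counter

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"a", "an", "the", "is", "are", "and", "to", "you", "at"}


def preprocess(text):
    """Lowercase, strip ASCII punctuation, split on whitespace, drop stopwords."""
    table = str.maketrans("", "", string.punctuation)
    tokens = text.lower().translate(table).split()
    return [t for t in tokens if t not in STOPWORDS]


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.

    tf  = occurrences of the term in the document / tokens in the document
    idf = ln(number of documents / number of documents containing the term)
    """
    n_docs = len(docs)
    # Count each term once per document to get its document frequency.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Made-up sample messages, not taken from the dataset.
messages = [
    "Win a free prize now!",
    "Are you free at noon?",
    "Claim the prize, win cash.",
]
tokenized = [preprocess(m) for m in messages]
weights = tf_idf(tokenized)

for tokens, w in zip(tokenized, weights):
    print(tokens, {term: round(score, 3) for term, score in w.items()})
```

In the first message, "now" appears in only one of the three documents while "win" appears in two, so "now" receives the higher TF-IDF score even though both occur once in that message.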
