- Models used:
  - MultinomialNB
  - BernoulliNB
- Steps:
- First, we removed punctuation (periods, commas, apostrophes, quotation marks, question marks, exclamation marks, brackets, braces, parentheses, dashes, hyphens, ellipses, colons, semicolons, etc.) and stopwords (a, the, is, are, and, etc.).
- Second, we applied lemmatization, a technique that reduces each word to its dictionary form (lemma). For example:
  - "running" becomes "run"
  - "better" becomes "good"
  - "wolves" becomes "wolf"
- Then, we vectorized the text with a TF-IDF vectorizer:
  - TF (Term Frequency): measures how often a word appears in a specific document.
  - IDF (Inverse Document Frequency): measures how rare a word is across the whole corpus; words that appear in few documents get a higher weight.
  - TF-IDF score: combines TF and IDF, so a word scores highly when it is frequent in one document but uncommon in the rest of the corpus.
- Finally, we fit the models and evaluated their performance.
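
The preprocessing and TF-IDF steps above can be sketched in plain Python. This is a minimal illustration, not the code used in this repository: the `STOPWORDS` set and the `LEMMAS` lookup table are tiny hypothetical stand-ins for a full stopword list and a real lemmatizer, the example emails are made up, and the IDF uses the textbook form `log(N / df)`.

```python
import math
import string
from collections import Counter

# Illustrative subsets only (assumptions): a real pipeline would use a full
# stopword list and a dictionary-based lemmatizer instead of these tables.
STOPWORDS = {"a", "the", "is", "are", "and"}
LEMMAS = {"running": "run", "better": "good", "wolves": "wolf"}


def preprocess(text):
    """Lowercase, strip punctuation, drop stopwords, map words to lemmas."""
    no_punct = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [LEMMAS.get(w, w) for w in no_punct.split() if w not in STOPWORDS]


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            # TF = share of the document's tokens; IDF = log(N / df).
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Made-up example emails, for illustration only.
emails = [
    "The wolves are running!",
    "Win a free prize, free entry.",
    "The meeting is running late.",
]
tokens = [preprocess(e) for e in emails]
weights = tf_idf(tokens)
```

Here "free" occurs twice in one email and nowhere else, so it gets a high score in that email, while "run" appears in two of the three emails and is weighted lower. Library implementations such as scikit-learn's `TfidfVectorizer` apply a smoothed IDF and normalize each vector by default, so their scores will differ from this sketch.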
asayem172153/Spam-Email-using-Naive-Bayes