Transformer BERT SMS Spam Detection

Google Cloud Platform - https://nlp-sms-spam-detection.wm.r.appspot.com/
Heroku - https://spam-sms-detect-nlp.herokuapp.com/


Dataset

https://www.kaggle.com/uciml/sms-spam-collection-dataset

Libraries Used

1. Flask
2. gunicorn
3. itsdangerous
4. Jinja2
5. MarkupSafe
6. Werkzeug
7. Pillow
8. Pickle
9. NLTK
10. NumPy
11. Scikit-learn
12. Pandas
13. Seaborn
14. Joblib
15. Matplotlib
16. HTML
17. CSS
18. Bootstrap
19. JavaScript

Project Walkthrough

1. Exploratory Data Analysis (EDA)
2. Data Cleaning
3. Data Manipulation
4. Feature Engineering
5. Applied stemming and lemmatization (Snowball Stemmer, Porter Stemmer, and WordNet Lemmatizer)
6. Implemented a Bag of Words model on the dataset
7. Implemented TF-IDF
8. Model Building - Trained Multinomial Naive Bayes and LightGBM classifiers. Multinomial Naive Bayes achieved a 94% F1 score after hyperparameter tuning
9. Exported the Multinomial Naive Bayes classifier using the Joblib library
10. Implemented DistilBERT, a Hugging Face transformer model
11. Fine-tuned DistilBERT
12. Developed the front end
13. Created a Flask server and deployed the code
14. Web app runs successfully on Google Cloud Platform and Heroku
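
To illustrate how the Bag of Words and Multinomial Naive Bayes steps fit together, here is a minimal, dependency-free sketch. It is not the project's code: the project uses scikit-learn for these steps, while this version reimplements the idea with the standard library only. The function names (`tokenize`, `train`, `predict`), the four toy messages, and the smoothing value `alpha=1.0` are all assumptions made for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and keep alphabetic tokens (a stand-in for the cleaning steps)."""
    return re.findall(r"[a-z]+", text.lower())


def train(messages, labels):
    """Build per-class bag-of-words counts, class counts, and the vocabulary."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    for text, label in zip(messages, labels):
        word_counts[label].update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocab


def predict(model, text, alpha=1.0):
    """Return the class with the highest log-posterior (Laplace smoothing)."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        total_words = sum(word_counts[label].values())
        # Start from the log prior of the class.
        score = math.log(class_counts[label] / total_docs)
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen during training
            likelihood = (word_counts[label][token] + alpha) / (
                total_words + alpha * len(vocab)
            )
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data, not taken from the Kaggle dataset.
messages = [
    "Win a free prize now",
    "Free cash, claim now",
    "Are we still meeting for lunch",
    "See you at home tonight",
]
labels = ["spam", "spam", "ham", "ham"]

model = train(messages, labels)
print(predict(model, "claim your free prize"))  # spam
print(predict(model, "see you at lunch"))       # ham
```

In practice, a scikit-learn vectorizer and `MultinomialNB` would replace `train` and `predict`, and the fitted model could then be saved with Joblib as in step 9.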

Email - tejasta@gmail.com
LinkedIn - https://www.linkedin.com/in/tejas-ta/
Blogs - https://tejasta.medium.com/