Toxic Comment Classification using Flask & AWS 🔍

This is a toxic comment classifier web application that uses a trained Logistic Regression model to predict the toxicity levels of a given text input.

Link to the web app: 👉🏻 Toxic Comment Classifier

Disclaimer: the dataset for this project contains text that may be considered profane, vulgar, or offensive.

🧐 About

This is a multi-label classification problem: the input is a text comment, and the output is the list of toxicity labels that apply to it.

The input text data needs to be cleaned and pre-processed for it to be useful for the Machine Learning model.
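As a rough illustration of what such cleaning can look like, here is a minimal sketch using only the standard library. The exact steps used in this project are not listed in this README, so the function name `clean_text` and the specific rules (lowercasing, dropping URLs, keeping only letters and apostrophes) are assumptions for illustration.

```python
import re


def clean_text(text: str) -> str:
    """Lowercase a comment and strip characters that carry little signal."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"[^a-z\s']", " ", text)     # keep letters, whitespace, apostrophes
    text = re.sub(r"\s+", " ", text)           # collapse runs of whitespace
    return text.strip()


print(clean_text("You're SO   wrong!!! See http://example.com :)"))
# prints: you're so wrong see
```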

📊 Dataset Overview

The dataset for this problem was taken from a competition hosted by Jigsaw on Kaggle.

For text vectorization, the outputs of both word-level and character-level TF-IDF vectorizers are used together as inputs to the model, which improves performance and minimizes the loss of input features.
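One way to combine word-level and character-level TF-IDF features with scikit-learn is sketched below. The sample comments and the n-gram ranges are assumptions chosen for illustration; the settings used in this project may differ.

```python
from scipy.sparse import hstack
from sklearn.feature_extraction.text import TfidfVectorizer

comments = [
    "you are a wonderful person",
    "you are an idiot",
    "thanks for the helpful edit",
]

# Word-level features capture whole tokens (illustrative n-gram range).
word_vectorizer = TfidfVectorizer(analyzer="word", ngram_range=(1, 1))
# Character-level features capture sub-word patterns such as unusual spellings.
char_vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(2, 4))

word_features = word_vectorizer.fit_transform(comments)
char_features = char_vectorizer.fit_transform(comments)

# Stack the two sparse matrices column-wise into a single feature matrix.
features = hstack([word_features, char_features]).tocsr()
print(features.shape)
```

Each row of `features` describes one comment, and the column count is the sum of the two vocabularies, so the model sees both views of the text at once.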

The different types of target labels present are: toxic, severe-toxic, obscene, threat, insult and identity hate.


🧠 Model Building

To build the classifier, we used Logistic Regression and treated the multi-label problem as a binary problem. We chose this approach over a OneVsRest classifier because the model performed better when the problem was treated as a binary one.

Since the data is imbalanced, accuracy alone is not a reliable evaluation metric, so we also used the F1-score to evaluate the model.
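One possible reading of this approach is to fit an independent binary Logistic Regression model for each label. The sketch below follows that reading; it is an assumption, not a copy of the project's training code. The toy comments, the two-label subset, and the helper `predict_probabilities` are hypothetical.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

LABELS = ["toxic", "insult"]  # subset of the labels, for brevity

comments = [
    "you are an idiot",
    "what a stupid idiot",
    "thanks for the helpful edit",
    "have a wonderful day",
    "this article is terrible garbage",
    "great work on the article",
]
# One column per label: 1 means the label applies to the comment.
targets = np.array([
    [1, 1],
    [1, 1],
    [0, 0],
    [0, 0],
    [1, 0],
    [0, 0],
])

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(comments)

# Fit one independent binary classifier per label.
models = {}
for column, label in enumerate(LABELS):
    clf = LogisticRegression()
    clf.fit(X, targets[:, column])
    models[label] = clf


def predict_probabilities(text):
    """Return the predicted probability of each label for one comment."""
    row = vectorizer.transform([text])
    return {label: float(clf.predict_proba(row)[0, 1]) for label, clf in models.items()}


scores = predict_probabilities("you idiot")
print(scores)
```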
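To see why accuracy can mislead on skewed data, consider a toy case with 5 toxic comments out of 100 and a model that always predicts "not toxic". The numbers are made up for illustration, and the binary F1 computed here is only one variant; which averaging the project used for its reported F1-score is not stated in this README.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1(y_true, y_pred):
    """Binary F1-score for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


y_true = [1] * 5 + [0] * 95  # 5 toxic comments, 95 clean ones
y_pred = [0] * 100           # a model that never flags anything

print(accuracy(y_true, y_pred))  # 0.95
print(f1(y_true, y_pred))        # 0.0
```

The always-negative model scores 95% accuracy while catching no toxic comments at all, which the F1-score of 0.0 makes visible.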

Here are the results on validation and test datasets:

Validation Results 👇🏻

Validation Accuracy: 0.9828502793879577
Validation F1-Score: 0.9811947440446507

Test Results 👇🏻

Test Accuracy: 0.9752805651942854
Test F1-Score: 0.9747181660736461


🎯 Getting Started

Project Structure:

.
├───data
│   ├───cleaned-data
│   └───raw-data
├───images
├───models
├───notebooks
├───static
│   └───css
├───templates
└───__pycache__

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

Flask==1.1.2
joblib==1.0.1
nltk==3.6.1
numpy==1.20.1
pandas==1.2.4
scikit_learn==1.0.2
scipy==1.6.2
swifter==1.0.9

Installing

Use Miniconda to install Python 3.8 or higher, then install the dependencies:

pip install -r requirements.txt

🎈 Usage

To run the web app, navigate to the main folder of the project and run:

python app.py

The server will be available at "localhost:5000".

Go to "localhost:5000", enter a comment, and click on classify to predict its toxicity values.
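For readers curious how a Flask route can serve such predictions, here is a minimal sketch. It is not the project's `app.py`: the route name `/classify`, the form field `comment`, and the placeholder `score_comment` function are all hypothetical, and the sketch returns JSON rather than rendering a template.

```python
from flask import Flask, jsonify, request

LABELS = ["toxic", "severe-toxic", "obscene", "threat", "insult", "identity hate"]


def score_comment(text):
    """Placeholder scorer standing in for the trained model (hypothetical)."""
    return {label: 0.0 for label in LABELS}


app = Flask(__name__)


@app.route("/classify", methods=["POST"])
def classify():
    comment = request.form.get("comment", "")
    if not comment.strip():
        # Reject empty submissions instead of scoring them.
        return jsonify({"error": "comment is empty"}), 400
    return jsonify(score_comment(comment))


# Exercise the route in-process with Flask's test client instead of starting a server.
with app.test_client() as client:
    response = client.post("/classify", data={"comment": "hello there"})
    empty_response = client.post("/classify", data={"comment": "   "})
    print(response.status_code, response.get_json())
```

In a real app, `score_comment` would load the saved vectorizers and model and return their predicted probabilities, and calling `app.run(port=5000)` would start the local server.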

🚀 Deployment

The model has been deployed on an EC2 instance on AWS, and its IP has been made publicly accessible. Below is the link to the AWS web app:

Link: http://ec2-18-117-78-151.us-east-2.compute.amazonaws.com:8080/

🌟 Support

Please hit the ⭐button if you like this project. 😄

vipul-shinde/toxic-comment-classification
