AI Generated Text Detection

This project classifies text as either human-generated or AI-generated, using a range of natural language processing (NLP) features and machine learning algorithms.

Features

The following features are extracted from the provided dataset:

Basic NLP Features:
- Character count, word count, word density, punctuation count, title-case word count, upper-case word count, noun count, adverb count, verb count, adjective count, pronoun count.
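
As an illustration of the surface-level counts listed above, here is a minimal sketch using only the Python standard library. The function name `basic_text_features` and the exact definitions (for example, word density as characters per word) are assumptions, not taken from the project's notebooks. The part-of-speech counts (noun, verb, and so on) need a tagger such as NLTK or spaCy and are left out here.

```python
import string


def basic_text_features(text: str) -> dict:
    """Compute simple surface-level counts for one piece of text."""
    words = text.split()
    char_count = len(text)
    word_count = len(words)
    return {
        "char_count": char_count,
        "word_count": word_count,
        # Assumed definition: characters per word (+1 avoids division by zero).
        "word_density": char_count / (word_count + 1),
        "punctuation_count": sum(ch in string.punctuation for ch in text),
        # Words whose first letter is upper-case and the rest lower-case.
        "title_word_count": sum(w.istitle() for w in words),
        # Words written entirely in upper-case.
        "upper_case_count": sum(w.isupper() for w in words),
    }


features = basic_text_features("Hello World! THIS is a test.")
print(features)
```
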
Term Frequencies and N-grams:
- Count vectorizer with 35742 features.
- Word bigrams (5000 features).
- Word trigrams (5000 features).
- Character bigrams and trigrams (5000 features).
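
To show what these n-gram features count, the sketch below builds word and character n-grams with the standard library and keeps the most frequent ones as a vocabulary. The term "count vectorizer" suggests scikit-learn's `CountVectorizer`, but the README does not confirm the library or its settings. The helper names (`word_ngrams`, `char_ngrams`, `top_ngram_vocabulary`) are hypothetical.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams of a text as space-joined strings."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n_min=2, n_max=3):
    """Return all character n-grams with n_min <= n <= n_max."""
    text = text.lower()
    return [
        text[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(text) - n + 1)
    ]


def top_ngram_vocabulary(corpus, extractor, max_features):
    """Keep the max_features most frequent n-grams across the corpus."""
    counts = Counter()
    for doc in corpus:
        counts.update(extractor(doc))
    return [gram for gram, _ in counts.most_common(max_features)]


corpus = ["the cat sat", "the cat ran"]
bigram_vocab = top_ngram_vocabulary(corpus, lambda d: word_ngrams(d, 2), max_features=1)
print(bigram_vocab)
```

In the project, the cap would be 5000 features per n-gram type instead of the toy value used here.
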
Topic Modeling:
- NeuralLDA with 20 topics.
Others:
- Readability score, named entity recognition (NER) count, text error length, and lexical diversity.
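
Lexical diversity can be measured in several ways, and the README does not say which one the project uses. One common choice is the type-token ratio (unique words divided by total words), sketched below as an assumption. The function name `lexical_diversity` and the tokenization rule are illustrative only.

```python
import re


def lexical_diversity(text):
    """Type-token ratio: unique words / total words (0.0 for empty text)."""
    # Simple tokenizer: runs of letters and apostrophes, case-insensitive.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


print(lexical_diversity("the cat and the dog"))
```

A value close to 1.0 means few repeated words. A lower value means more repetition.
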

Feature Selection

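After feature extraction, Principal Component Analysis (PCA) is applied with n_components set to 256. Strictly speaking, PCA is a dimensionality-reduction step, not feature selection: it projects the extracted features onto 256 components instead of picking a subset of the original columns.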
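
The reduction step might look like the following sketch, assuming scikit-learn's `PCA` (the README names only the technique and the `n_components` value). The random matrix `X` is a placeholder for the real feature matrix, and its shape is made up for the example.

```python
import numpy as np
from sklearn.decomposition import PCA

# Placeholder data: 300 documents with 1000 extracted features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 1000))

# Project the features onto 256 principal components.
# n_components must not exceed min(n_samples, n_features).
pca = PCA(n_components=256, random_state=0)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)  # (300, 256)
```

The fitted `pca` object would need to be reused at prediction time so that new text is projected into the same 256-dimensional space.
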

Algorithms

The project trains and tests five algorithms:

- Random Forest
- Support Vector Machine (SVM)
- XGBoost
- Gradient Boosting
- Logistic Regression
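
A comparison of this kind could be set up as in the sketch below, assuming scikit-learn estimators with mostly default hyperparameters. The data is synthetic (`make_classification`), so the scores say nothing about the project's actual results. XGBoost is left out to avoid an extra dependency. If the `xgboost` package is installed, its `XGBClassifier` could be added to the dictionary in the same way.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the PCA-reduced features and human/AI labels.
X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "Random Forest": RandomForestClassifier(random_state=0),
    "SVM": SVC(),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

# Fit each model and record its accuracy on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```
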

Performance

Of the five algorithms tested, Gradient Boosting performed best, and it classified text accurately during the prediction phase.

Flask Application

A simple Flask application demonstrates the AI Generated Text Detection model. Users enter text, and the application classifies it as either human-generated or AI-generated.
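
The shape of such an application is sketched below. The `/predict` route, the JSON payload format, and the `classify_text` helper are all hypothetical. The placeholder rule inside `classify_text` stands in for the real pipeline (feature extraction, PCA, and the trained model) and is not a detector.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def classify_text(text):
    """Placeholder for the real model call (hypothetical helper).

    A real implementation would extract features, apply the fitted PCA,
    and call the trained classifier. This dummy rule only keeps the
    example runnable.
    """
    return "AI-generated" if len(text.split()) > 50 else "human-generated"


@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body such as {"text": "..."}.
    payload = request.get_json(silent=True) or {}
    text = payload.get("text", "")
    if not isinstance(text, str) or not text.strip():
        return jsonify(error="No text provided."), 400
    return jsonify(label=classify_text(text))
```

Saved as a module, the sketch could be started with Flask's command-line runner (`flask --app <module name> run`) and queried by sending a POST request with a JSON body to `/predict`.
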

Usage

To use the project:

- Clone the repository from GitHub.
- Install the required dependencies.
- Run the Flask application.
- Enter text to classify it as human-generated or AI-generated.

Contributors

Habeeb Moosa - Project Lead & Developer
Hanisha Musangi - Frontend Developer

License

This project is licensed under the MIT License.

habeebmoosa/ai-text-detector