Facial Expression Recognition using Vision Transformer (ViT)

Project Overview

This project implements a Facial Expression Recognition (FER) system using the Vision Transformer (ViT) architecture. The system classifies facial expressions into seven categories: happy, sad, angry, fearful, disgusted, surprised, and neutral. The model achieved 85% accuracy on the test set.

Introduction

Facial expression recognition is an important task in computer vision, with applications ranging from human-computer interaction to psychological research. This project uses the Vision Transformer (ViT) architecture, which has shown strong performance on a range of image classification tasks.

Features

High Accuracy: Achieved 85% accuracy in facial expression recognition.
ViT Architecture: Uses a Vision Transformer model for image classification.
Seven Expression Classes: Recognizes seven different facial expressions.
Pre-trained Models: Uses pre-trained weights for faster convergence and improved performance.

Dataset

The model is trained on standard facial expression datasets such as FER-2013, the MMI Facial Expression Database, and AffectNet. Ensure the datasets are downloaded and arranged into one folder per expression class within each split. Here is an example of the dataset directory structure:

dataset/
    train/
        happy/
        sad/
        ...
    val/
        happy/
        sad/
        ...
    test/
        happy/
        sad/
        ...
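
As a rough illustration of how this layout might be consumed, the sketch below walks one split and pairs each image path with a class index. The function name `index_split`, the class folder names (taken from the seven categories listed in the overview), and the accepted file extensions are all assumptions for illustration; the notebook may load data differently.

```python
from pathlib import Path

# Assumed folder names, based on the seven categories named in the overview.
CLASSES = ["angry", "disgusted", "fearful", "happy", "neutral", "sad", "surprised"]
# Assumption: images are stored with one of these common extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def index_split(root, split):
    """Return (image_path, class_index) pairs for one split, e.g. "train"."""
    samples = []
    for class_index, class_name in enumerate(CLASSES):
        class_dir = Path(root) / split / class_name
        if not class_dir.is_dir():
            continue  # skip class folders that are absent from this split
        for path in sorted(class_dir.iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, class_index))
    return samples
```

Calling `index_split("dataset", "train")` would then give a list that can be wrapped by whichever data loader the training code uses.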

Download the FER-2013 dataset: FER-2013 dataset
Download the MMI Facial Expression Database: MMI Facial Expression Database
Download the AffectNet dataset: AffectNet dataset

Model Architecture

The Vision Transformer (ViT) architecture is used for this project. ViT divides the input image into patches, linearly embeds each patch, and feeds the sequence of linear embeddings into a transformer encoder. The transformer processes the sequence and outputs class probabilities for facial expressions.
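
The patch-and-embed step described above can be sketched with NumPy. This is a minimal illustration, not the project's implementation: the 224x224 input size, the patch size of 16, the embedding width of 64, and the random weights standing in for learned parameters are all assumptions. The class token and position embeddings are part of the standard ViT design but are not described in this README.

```python
import numpy as np


def patchify(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0, "image must tile evenly"
    rows, cols = h // patch_size, w // patch_size
    patches = image.reshape(rows, patch_size, cols, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)  # (rows, cols, p, p, c)
    return patches.reshape(rows * cols, patch_size * patch_size * c)


def embed_patches(patches, weight, bias, cls_token, pos_embed):
    """Linearly embed patches, prepend a class token, add position embeddings."""
    tokens = patches @ weight + bias         # (num_patches, dim)
    tokens = np.vstack([cls_token, tokens])  # (num_patches + 1, dim)
    return tokens + pos_embed


rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))      # assumed input size
patch_size, dim = 16, 64               # assumed hyperparameters
patches = patchify(image, patch_size)  # 14 * 14 = 196 patches of length 768
# Random values stand in for parameters that would be learned in training.
weight = rng.normal(scale=0.02, size=(patches.shape[1], dim))
bias = np.zeros(dim)
cls_token = np.zeros((1, dim))
pos_embed = rng.normal(scale=0.02, size=(patches.shape[0] + 1, dim))
tokens = embed_patches(patches, weight, bias, cls_token, pos_embed)
print(tokens.shape)  # (197, 64)
```

The resulting token sequence is what the transformer encoder would consume; a classification head on the class token then produces the seven expression scores.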

Results

Our model achieved the following results on the test set:

Accuracy: 85%
Precision: 82%
Recall: 81%
F1 Score: 80%
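
The README does not state how precision, recall, and F1 were averaged across the seven classes. The sketch below shows one common choice, macro averaging, computed from true and predicted label lists. The function name `macro_metrics` and the toy labels are hypothetical and are not the project's data.

```python
def macro_metrics(y_true, y_pred, num_classes):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1s = [], [], []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / num_classes,
        "recall": sum(recalls) / num_classes,
        "f1": sum(f1s) / num_classes,
    }


# Toy labels for three classes, purely illustrative.
report = macro_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3)
```

Under macro averaging, the F1 score is the mean of the per-class F1 values, so it can fall below both the averaged precision and the averaged recall.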

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributors

Contact

Mohammed Abdeldayem
LinkedIn
