HandSpeak is a real-time American Sign Language (ASL) recognition system that uses computer vision and deep learning to identify hand gestures and translate them into text. The system leverages hand landmarks, ResNet for group classification, and multiple MLP models for specific letter predictions.
- Real-Time Gesture Recognition: Detects and classifies ASL gestures in real-time using a webcam.
- Hierarchical Classification: Uses a ResNet model to classify gestures into broad groups and MLP models for specific letter predictions (see the sketch after this list).
- Landmark-Based Detection: Utilizes hand landmarks for robust recognition across varying lighting conditions and backgrounds.
- GUI Integration: Displays recognized letters and words in a user-friendly Tkinter-based interface.
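To make the hierarchical flow concrete, below is a minimal sketch of the two-stage prediction, assuming PyTorch models and that the group order matches the ResNet's output classes; the actual logic lives in `src/handspeak.py`.

```python
import torch

# Assumed ordering of the four gesture groups as ResNet output classes.
GROUPS = ["amnste", "dfbuvlkrw", "copqzx", "ghyji"]

def predict(resnet, mlps, landmark_image, landmark_features):
    """Stage 1: the ResNet classifies the landmark image into a broad group.
    Stage 2: that group's MLP predicts the letter from numeric landmark features."""
    with torch.no_grad():
        group = GROUPS[resnet(landmark_image).argmax(dim=1).item()]
        letter_index = mlps[group](landmark_features).argmax(dim=1).item()
    return group, letter_index  # letter_index maps to a character via labels.json
```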
The different fingerspelling signs covered in this project:
Watch the outcome video demonstrating the HandSpeak system in action:
`outcomevid.mp4`
HandSpeak/
├── main.py                      # Entry point for the application
├── src/
│   ├── handspeak.py             # Core logic for gesture detection and classification
│   ├── models/
│   │   ├── resnet.py            # ResNet model for group classification
│   │   ├── mlp.py               # MLP models for specific letter prediction
│   ├── notebooks/
│   │   ├── prepare_data.ipynb   # Data preparation and collection
│   │   ├── train_model.ipynb    # Model training scripts
│   ├── utils/
│   │   ├── measure_funcs.py     # Utility functions for distance and angle calculations
│   ├── labels.json              # Encoded labels for gesture groups
- Clone the repository:

  git clone https://github.com/yourusername/HandSpeak.git
  cd HandSpeak

- Create a virtual environment and activate it:

  python -m venv venv
  source venv/bin/activate  # On Windows use `venv\Scripts\activate`

- Install the required packages:

  pip install -r requirements.txt
- Prepare the dataset by following the instructions in the Data Preparation section.
- Train the model by following the instructions in the Training the Model section.
- Run the application:
python main.py
- Open the `src/notebooks/prepare_data.ipynb` notebook.
- Follow the instructions to create directories for training and testing data.
- Collect hand gesture images using OpenCV and process them before saving (a rough sketch of this step follows the list).
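A rough sketch of such a collection loop, capturing webcam frames with OpenCV and saving them per gesture label; the directory layout, key bindings, and image size here are assumptions, and the notebook remains the reference.

```python
import os
import cv2

def collect_images(label, out_dir="data/train", num_images=200, size=224):
    """Save webcam frames as training images for one gesture label."""
    save_dir = os.path.join(out_dir, label)
    os.makedirs(save_dir, exist_ok=True)

    cap = cv2.VideoCapture(0)
    count = 0
    while count < num_images:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imshow(f"collecting '{label}'", frame)
        key = cv2.waitKey(1) & 0xFF
        if key == ord("s"):          # press 's' to save the current frame
            img = cv2.resize(frame, (size, size))
            cv2.imwrite(os.path.join(save_dir, f"{label}_{count}.jpg"), img)
            count += 1
        elif key == ord("q"):        # press 'q' to stop early
            break
    cap.release()
    cv2.destroyAllWindows()
```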
- Open the `src/notebooks/train_model.ipynb` notebook.
- Follow the instructions to load the dataset, define the model, and train it.
- Save the trained model weights (a minimal training-and-saving sketch follows the list).
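A minimal sketch of the training-and-saving step, assuming PyTorch, a standard `DataLoader`, and cross-entropy loss; the hyperparameters and weights path are illustrative, and the notebook's actual loop may differ.

```python
import torch
from torch import nn, optim

def train(model, loader, epochs=20, lr=1e-3, weights_path="model_weights.pth"):
    """Train a classifier on (features, label) batches and persist its weights."""
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=lr)
    model.train()
    for epoch in range(epochs):
        total_loss = 0.0
        for inputs, targets in loader:
            optimizer.zero_grad()
            loss = criterion(model(inputs), targets)
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        print(f"epoch {epoch + 1}: mean loss {total_loss / len(loader):.4f}")
    torch.save(model.state_dict(), weights_path)  # save the trained weights
```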
- Used for classifying gestures into broad groups (`AMNSTE`, `DFBUVLKRW`, `COPQZX`, `GHYJI`).
- Implements residual blocks for efficient feature extraction (a sketch of such a block is shown below).
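For reference, a basic residual block of the kind `src/models/resnet.py` might use could look like this; PyTorch is assumed, and the project's exact block design may differ.

```python
import torch
from torch import nn

class ResidualBlock(nn.Module):
    """Two 3x3 convolutions with a skip connection added before the final ReLU."""

    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, 3, stride, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, 3, 1, 1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        # Project the input on the skip path when its shape does not match.
        self.shortcut = nn.Sequential()
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, 1, stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + self.shortcut(x))
```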
- Predicts specific letters within each group.
- Lightweight and optimized for real-time inference.
The `labels.json` file maps each gesture group's letters, keyed by ASCII code, to class indices:

{
  "amnste": { "97": 0, "101": 1, "109": 2, "110": 3, "115": 4, "116": 5 },
  ...
}
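For example, the mapping can be loaded and decoded back to characters with `chr`; the path below assumes the file sits at `src/labels.json` as shown in the project structure.

```python
import json

with open("src/labels.json") as f:
    labels = json.load(f)

# Each group maps a letter's ASCII code (string key) to its class index.
for code, idx in sorted(labels["amnste"].items(), key=lambda kv: kv[1]):
    print(f"class {idx} -> '{chr(int(code))}'")
# class 0 -> 'a', class 1 -> 'e', class 2 -> 'm', class 3 -> 'n', class 4 -> 's', class 5 -> 't'
```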
- `euclidean_distance`: Calculates the distance between two points.
- `calculate_angle`: Computes the angle between two points.
- `is_above`: Determines whether one point is above another.
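Illustrative implementations of these helpers; the actual functions in `src/utils/measure_funcs.py` may differ (in particular, how the angle is defined here is an assumption).

```python
import math

def euclidean_distance(p1, p2):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1])

def calculate_angle(p1, p2):
    """Angle, in degrees, of the line from p1 to p2 relative to the x-axis."""
    return math.degrees(math.atan2(p2[1] - p1[1], p2[0] - p1[0]))

def is_above(p1, p2):
    """True if p1 is above p2 in image coordinates (smaller y is higher up)."""
    return p1[1] < p2[1]
```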
The diagram below shows how images are collected; you can find more details in `src/notebooks/prepare_data.ipynb`.
The two outputs of the previous section, the numeric landmark features and the landmarks drawn onto a pure white image, serve as inputs to this section for predicting the specific letter.
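Below is a sketch of how such a white-canvas landmark image could be produced, assuming MediaPipe Hands is used for landmark detection (an assumption; see `src/notebooks/prepare_data.ipynb` for the actual preprocessing).

```python
import cv2
import numpy as np
import mediapipe as mp

mp_hands = mp.solutions.hands
mp_drawing = mp.solutions.drawing_utils

def landmarks_on_white(frame_bgr, size=224):
    """Detect hand landmarks in a BGR frame and redraw them on a pure white image."""
    canvas = np.full((size, size, 3), 255, dtype=np.uint8)
    with mp_hands.Hands(static_image_mode=True, max_num_hands=1) as hands:
        results = hands.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            # Landmarks are normalized, so drawing scales them to the canvas size.
            mp_drawing.draw_landmarks(
                canvas,
                results.multi_hand_landmarks[0],
                mp_hands.HAND_CONNECTIONS,
            )
    return canvas
```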
Created with PlotNeuralNet
- First layer (input features): 15 neurons
- Second layer (hidden layer): 128 neurons
- Third layer (hidden layer): 64 neurons
- Fourth layer (hidden layer): 32 neurons
- Final layer (predicted letters): depends on the group's letters; in the image above, for example, the final layer has 6 neurons.
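In code, that architecture could be expressed roughly as follows; PyTorch and ReLU activations are assumptions.

```python
from torch import nn

class LetterMLP(nn.Module):
    """MLP with the layer sizes listed above: 15 -> 128 -> 64 -> 32 -> n letters."""

    def __init__(self, num_features=15, num_letters=6):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_features, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, num_letters),   # e.g. 6 outputs for the "amnste" group
        )

    def forward(self, x):
        return self.net(x)
```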
The training process demonstrates that the ResNet model effectively learns the dataset, achieving near-perfect accuracy on the test set.
The following table presents the final loss values for each of the MLP models after training:
| Model Name | Final Loss |
|---|---|
| amnste | 0.3044 |
| dfbuvlkrw | 0.3017 |
| ghyji | 0.0293 |
| copqzx | 1.6452 |
You can find more details in `src/notebooks/train_model.ipynb`.
- Optimize the data processing pipeline.
- Explore alternative model architectures.
- Use MediaPipe to keep the hand as the primary focus of the frame by detecting hand landmarks and cropping the hand region.
- Smoother word transitions between predictions.
- Add more and better-designed numeric features to improve the accuracy of each MLP model.
- Add an autocorrect system for the typed text.
Contributions are welcome! Feel free to submit issues or pull requests.
This project is licensed under the MIT License.