
Transformer from Scratch

The Transformer architecture, introduced in the paper "Attention Is All You Need," has become a cornerstone of many natural language processing tasks. This project implements a Transformer model from scratch using PyTorch.

Model Architecture

The Transformer model consists of the following components:

  1. Encoder
  2. Decoder
  3. Multi-Head Attention
  4. Position-wise Feed-Forward Networks
  5. Positional Encoding

[Figure: Transformer architecture diagram]
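
To show how these components fit together, here is a minimal wiring sketch. The positional encoding follows the sinusoidal scheme from the paper; for brevity, PyTorch's built-in encoder/decoder layers stand in for the from-scratch modules in src/model/, and all class and argument names here are illustrative rather than this repository's actual API.

import math
import torch
import torch.nn as nn

class PositionalEncoding(nn.Module):
    """Sinusoidal positional encoding, as in "Attention Is All You Need"."""
    def __init__(self, d_model, max_seq_length=5000):
        super().__init__()
        position = torch.arange(max_seq_length, dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float)
                             * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_seq_length, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)  # even dimensions
        pe[:, 1::2] = torch.cos(position * div_term)  # odd dimensions
        self.register_buffer("pe", pe)

    def forward(self, x):
        # x: (batch, seq_len, d_model) -> add the encoding for each position
        return x + self.pe[: x.size(1)]

class TinyTransformer(nn.Module):
    # Wiring sketch only: PyTorch's built-in layers stand in for the
    # from-scratch encoder/decoder implemented in src/model/.
    def __init__(self, src_vocab_size, tgt_vocab_size, d_model=512,
                 num_heads=8, num_layers=6, d_ff=2048, dropout=0.1):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab_size, d_model)
        self.tgt_embed = nn.Embedding(tgt_vocab_size, d_model)
        self.pos_enc = PositionalEncoding(d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, num_heads, d_ff,
                                               dropout, batch_first=True)
        dec_layer = nn.TransformerDecoderLayer(d_model, num_heads, d_ff,
                                               dropout, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers)
        self.out_proj = nn.Linear(d_model, tgt_vocab_size)

    def forward(self, src, tgt, tgt_mask=None):
        # Encoder: self-attention over the source sentence.
        memory = self.encoder(self.pos_enc(self.src_embed(src)))
        # Decoder: masked self-attention, then cross-attention over memory.
        dec = self.decoder(self.pos_enc(self.tgt_embed(tgt)), memory,
                           tgt_mask=tgt_mask)
        return self.out_proj(dec)

The decoder attends to the encoder output ("memory") via cross-attention, which is what lets the target-language output condition on the source sentence.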

Project Structure

  • src/: Contains the source code for the Transformer model
    • model/: Transformer model components
    • utils/: Utility functions for data processing
    • train.py: Script for training the Transformer
    • translate.py: Script for using the trained model for translation
  • tests/: Unit tests for model components
  • data/: Directory to store dataset files
  • vocab/: Directory to store vocabulary files

Setup

  1. Clone the repository:

    git clone https://github.com/yourusername/transformer-from-scratch.git
    cd transformer-from-scratch
    
  2. Create a virtual environment:

    python -m venv transformer
    
  3. Activate the virtual environment:

    • On Windows: transformer\Scripts\activate
    • On macOS and Linux: source transformer/bin/activate
  4. Install dependencies:

    pip install -r requirements.txt
    

Usage

Preparing the Data

  1. Place your English sentences in data/english_sentences.txt and French sentences in data/french_sentences.txt.

  2. Create vocabularies:

    python src/create_vocab.py
    

Training the Model

To train the Transformer model, run:

python src/train.py

This will start the training process and save the model checkpoints in the saved_models directory.
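For orientation, a single training step for a sequence-to-sequence Transformer typically looks like the sketch below: teacher forcing with a shifted target, a causal mask over the decoder input, and padding tokens ignored in the loss. This is a generic sketch (padding id 0 is assumed), not the exact contents of src/train.py.

import torch.nn as nn

# Hypothetical single training step; `model` is a Transformer like the sketch
# above, and `src`/`tgt` are batches of token ids padded with id 0 (assumed).
criterion = nn.CrossEntropyLoss(ignore_index=0)

def train_step(model, optimizer, src, tgt):
    # Teacher forcing: the decoder reads tgt[:, :-1] and predicts tgt[:, 1:].
    tgt_in, tgt_out = tgt[:, :-1], tgt[:, 1:]
    # Causal mask so position i cannot attend to later positions.
    tgt_mask = nn.Transformer.generate_square_subsequent_mask(
        tgt_in.size(1)).to(src.device)
    logits = model(src, tgt_in, tgt_mask=tgt_mask)
    loss = criterion(logits.reshape(-1, logits.size(-1)), tgt_out.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()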

Translating Sentences

After training the model, you can use it for translation:

python src/translate.py

Example usage in your code:

from src.translate import Translator

translator = Translator(
    model_path="saved_models/final_model.pth",
    src_vocab_path="vocab/english.model",
    tgt_vocab_path="vocab/french.model",
    device="cuda"  # or "cpu" if you don't have a GPU
)

english_sentence = "Hello, how are you?"
french_translation = translator.translate(english_sentence)
print(f"English: {english_sentence}")
print(f"French: {french_translation}")

Run in Docker

Building the Docker Images

docker-compose build

Running the Services

  • To run the training service:

    docker-compose up transformer
    
  • To run the translation service:

    docker-compose up translator

Testing [TODO - Ignore for now!]

To run the unit tests, execute:

python -m unittest discover tests
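
The tests themselves are still a TODO, but a typical unit test here would assert tensor shapes. The sketch below checks the positional-encoding module from the earlier sketch; its import path is hypothetical and will depend on how src/model/ is laid out.

import unittest
import torch

# Hypothetical import path; adjust to the actual module in src/model/.
from src.model import PositionalEncoding

class TestPositionalEncoding(unittest.TestCase):
    def test_shape_is_preserved(self):
        pe = PositionalEncoding(d_model=64, max_seq_length=128)
        x = torch.zeros(2, 10, 64)  # (batch, seq_len, d_model)
        self.assertEqual(pe(x).shape, x.shape)

if __name__ == "__main__":
    unittest.main()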

Customization

You can customize the model by modifying the hyperparameters in src/train.py. The main hyperparameters are:

  • src_vocab_size: Size of the source vocabulary
  • tgt_vocab_size: Size of the target vocabulary
  • d_model: Dimensionality of the model
  • num_heads: Number of attention heads
  • num_layers: Number of encoder and decoder layers
  • d_ff: Dimensionality of the feed-forward network
  • dropout: Dropout rate
  • max_seq_length: Maximum sequence length
  • batch_size: Batch size for training
  • num_epochs: Number of training epochs
  • learning_rate: Learning rate for the optimizer
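
As a concrete illustration, a configuration using the "base" model sizes from Vaswani et al. might look like the following. Everything marked "assumed" is illustrative rather than taken from src/train.py:

# d_model/num_heads/num_layers/d_ff/dropout follow the "base" configuration
# in Vaswani et al.; the remaining values are assumptions for illustration.
config = dict(
    src_vocab_size=8000,  # must match the SentencePiece vocabularies
    tgt_vocab_size=8000,
    d_model=512,
    num_heads=8,
    num_layers=6,
    d_ff=2048,
    dropout=0.1,
    max_seq_length=128,   # assumed
    batch_size=32,        # assumed
    num_epochs=10,        # assumed
    learning_rate=1e-4,   # assumed
)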

Dependencies

This project requires the following main dependencies:

  • PyTorch
  • NumPy
  • tqdm
  • matplotlib
  • sentencepiece

For a complete list of dependencies, please refer to the requirements.txt file.

Contributing

Contributions to this project are welcome. Please feel free to submit a Pull Request.

License

This project is open source and available under the MIT License.

Acknowledgements

  • The Transformer architecture is based on the paper "Attention Is All You Need" by Vaswani et al.
  • This implementation was inspired by various open-source Transformer implementations and tutorials available in the community.