LSTM Scratch Implementation

This repository contains a scratch implementation of Long Short-Term Memory (LSTM) networks in C++. The project aims to help understand the inner workings of LSTMs by building them from the ground up without using high-level libraries. The LSTM model is trained for a character-level recognition task using a baby names dataset from Kaggle.

Features

LSTM implemented from scratch in C++, including an embedding layer for the input and a softmax function applied to the output.
Custom dataset loading and preprocessing.
Training and evaluation scripts.
Visualization of training metrics.
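
To illustrate the softmax step mentioned above, here is a minimal sketch of a numerically stable softmax in C++. The function name `softmax` and its signature are assumptions made for this example and are not taken from model.cpp; the repository's own version may be organized differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from model.cpp): converts raw output scores
// (logits) into a probability distribution, e.g. over characters.
std::vector<double> softmax(const std::vector<double>& logits) {
    assert(!logits.empty());
    // Subtracting the largest logit keeps std::exp from overflowing
    // and does not change the result.
    const double max_logit = *std::max_element(logits.begin(), logits.end());
    std::vector<double> probs(logits.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp(logits[i] - max_logit);
        sum += probs[i];
    }
    // Normalize so the entries sum to 1.
    for (double& p : probs) {
        p /= sum;
    }
    return probs;
}
```

Calling `softmax({1.0, 2.0, 3.0})` returns three probabilities that sum to 1, with the largest probability assigned to the largest score.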

File Structure

data/: Contains the dataset files.
pretrained/: Directory for saving pretrained models.
dataset.cpp: Handles dataset loading and preprocessing.
utils.cpp: Collects the first n samples from the dataset and handles name preprocessing and encoding.
model.cpp: Contains the LSTM model implementation, including initialization, forward propagation, backward propagation, parameter optimization, and loss & metrics calculation.
train.cpp: Script for training the LSTM model and saving weights and metrics.
plot.cpp: Handles visualization of training metrics using matplotlib.
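
As a rough picture of what forward propagation involves, the sketch below shows one time step of a standard LSTM cell. All names here (`Gate`, `LstmParams`, `LstmState`, `lstm_step`) are hypothetical and chosen for this example; the actual layout in model.cpp may differ. It assumes each gate has one weight matrix applied to the concatenation of the previous hidden state and the current input.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // row-major: Mat[row][col]

// Parameters of one gate (hypothetical layout). The weights act on the
// concatenation [h_prev, x].
struct Gate {
    Mat W;  // hidden_size rows, (hidden_size + input_size) columns
    Vec b;  // hidden_size entries
};

struct LstmParams { Gate forget, input, candidate, output; };
struct LstmState { Vec h, c; };  // hidden state and cell state

double sigmoid(double z) { return 1.0 / (1.0 + std::exp(-z)); }

// Pre-activation of one gate: W * concat + b.
Vec affine(const Gate& gate, const Vec& concat) {
    assert(gate.W.size() == gate.b.size());
    Vec out(gate.b);
    for (std::size_t r = 0; r < gate.W.size(); ++r) {
        assert(gate.W[r].size() == concat.size());
        for (std::size_t k = 0; k < concat.size(); ++k) {
            out[r] += gate.W[r][k] * concat[k];
        }
    }
    return out;
}

// One forward time step of a standard LSTM cell.
LstmState lstm_step(const LstmParams& p, const LstmState& prev, const Vec& x) {
    Vec concat(prev.h);
    concat.insert(concat.end(), x.begin(), x.end());

    const Vec f = affine(p.forget, concat);
    const Vec i = affine(p.input, concat);
    const Vec g = affine(p.candidate, concat);
    const Vec o = affine(p.output, concat);

    const std::size_t n = prev.h.size();
    assert(prev.c.size() == n && f.size() == n);
    LstmState next{Vec(n), Vec(n)};
    for (std::size_t j = 0; j < n; ++j) {
        // Forget part of the old cell state, then add the gated candidate.
        next.c[j] = sigmoid(f[j]) * prev.c[j] + sigmoid(i[j]) * std::tanh(g[j]);
        // The output gate decides how much of the cell state is exposed.
        next.h[j] = sigmoid(o[j]) * std::tanh(next.c[j]);
    }
    return next;
}
```

Processing a name would then amount to calling `lstm_step` once per character, feeding each returned state into the next call.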

Installation

Prerequisites

C++ compiler (supporting C++17)
CMake
Matplotlib for C++ (the matplotlibcpp.h wrapper, which requires a Python installation with Matplotlib)

Setup Instructions

Clone the repository:

git clone https://github.com/binguliki/LSTM-Scratch-Implementation.git
cd LSTM-Scratch-Implementation

Prepare the dataset:

cd data
g++ utils.cpp -std=c++17 -o utils
./utils
cd ..

Train the model: Adjust the number of training instances if necessary, then compile and run the training script.
```
g++ train.cpp -std=c++17 -o train
./train
```
Note: Training may take 10 to 15 minutes.
Plot the training metrics: Update the paths in CMakeLists.txt according to the Python installation on your system.
```
mkdir -p build
cd build
cmake ..
make
./plot_graph
```
Make sure to replace file paths where necessary.

Usage

Load and preprocess the dataset: The utils.cpp file is used to load the first n samples from the dataset and preprocess them into one-hot vector representations.
Train the LSTM model: The train.cpp file handles the training process, saving weights and metrics of the trained model.
Visualize training metrics: The plot.cpp file uses a matplotlib wrapper to visualize the performance of the model during training and print the test accuracy.
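
The one-hot preprocessing step can be sketched as follows. This is an illustrative example and not the code in utils.cpp: the function name `one_hot_encode`, the 26-letter alphabet, and the choice to lowercase names and skip other characters are all assumptions.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Assumption for this sketch: names are encoded over the letters a-z only.
constexpr std::size_t kAlphabetSize = 26;

// Hypothetical helper (not from utils.cpp): returns one one-hot row per
// letter of `name`. Letters are lowercased; other characters are skipped.
std::vector<std::vector<double>> one_hot_encode(const std::string& name) {
    std::vector<std::vector<double>> rows;
    rows.reserve(name.size());
    for (unsigned char ch : name) {
        const int lower = std::tolower(ch);
        if (lower < 'a' || lower > 'z') {
            continue;  // skip spaces, digits, punctuation
        }
        std::vector<double> row(kAlphabetSize, 0.0);
        row[static_cast<std::size_t>(lower - 'a')] = 1.0;  // mark this letter
        rows.push_back(row);
    }
    return rows;
}
```

For example, `one_hot_encode("Abz")` yields three rows of length 26 with the 1.0 entries at indices 0, 1, and 25.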

Pretrained Model

The model has been trained on 12,000 instances and tested on 2,000 instances. The weights of the trained model are provided in the pretrained/ directory.

Images

Utils file output (image)
Training output (image)
Performance (image)
Testing output (image)

Contributing

Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.