LLaMA2 model inference using only the C++ standard library. This project aims to provide a transparent, debuggable way to understand LLMs through a sequential, readable codebase (parallel implementations are harder to step through and interpret).
The primary goals are to:
- Deeply understand LLM internals
- Facilitate research, optimization, and performance improvements
- Provide a clean, step-by-step implementation for learning and debugging
Start by downloading the Llama2 model from https://huggingface.co/meta-llama/Llama-2-7b.
- Export the tokenizer to binary format:
```bash
python utils/export_tokenizer.py --input /path/to/tokenizer.model --output /path/to/tokenizer.bin
```
- Export the quantized model (see the quantization sketch below for the idea behind the q80 format):
```bash
python utils/export_model.py --model_path /path/to/llama/weights/folder --output /path/to/model_q80.bin
```
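The `q80` suffix suggests 8-bit weight quantization. Below is a minimal C++ sketch of group-wise symmetric int8 quantization, written under the assumption that the export format works roughly this way; the actual binary layout and group size used by `export_model.py` may differ, and `QuantizedGroup`, `quantize_group`, and `dequantize` are illustrative names rather than the project's API.

```cpp
// A minimal sketch of group-wise symmetric int8 quantization (the idea behind
// a "q80"-style format). Illustrative only: the real export script's binary
// layout and group size may differ, and these names are made up.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <vector>

struct QuantizedGroup {
    float scale;                 // per-group scale = max(|w|) / 127
    std::vector<int8_t> values;  // weights rounded into [-127, 127]
};

QuantizedGroup quantize_group(const std::vector<float>& weights) {
    float max_abs = 0.0f;
    for (float w : weights) max_abs = std::max(max_abs, std::fabs(w));

    QuantizedGroup q;
    q.scale = max_abs / 127.0f;
    float inv_scale = (q.scale > 0.0f) ? 1.0f / q.scale : 0.0f;

    q.values.reserve(weights.size());
    for (float w : weights) {
        q.values.push_back(static_cast<int8_t>(std::lround(w * inv_scale)));
    }
    return q;
}

// Dequantization is just a multiply by the group's scale.
float dequantize(const QuantizedGroup& q, std::size_t i) {
    return q.scale * static_cast<float>(q.values[i]);
}

int main() {
    QuantizedGroup q = quantize_group({0.12f, -0.80f, 0.35f, 0.02f});
    std::cout << "restored first weight: " << dequantize(q, 0) << "\n";
}
```

Storing one float scale per group of int8 values roughly quarters the weight size compared to float32 while keeping dequantization a single multiply.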
Build the project:
```bash
mkdir build
cd build
cmake .. && make
```
Then run inference, passing a prompt, the tokenizer binary, and the quantized model:
```bash
./transformer_cpp <prompt> <tokenizer_path> <model_path>
```
Concrete example:
```bash
./transformer_cpp hi ../llama2_weights/tokenizer.bin ../llama2_weights/llama2_q80.bin
```
- Tokenizer and weight loading adapted from Andrej Karpathy's llama2.c
- Core implementation written from scratch
- Quantization
- Greedy sampling (see the sampling sketch after this list)
- KV caching
- Advanced sampling techniques
  - Top-k sampling
  - Top-n sampling
- Performance optimizations
  - Speculative decoding
  - Parallelizing attention heads
  - Optimized matrix multiplication
  - Parallel token processing
- Advanced features
  - Backward propagation
  - CUDA matrix multiplication kernel
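As a reference point for the sampling items above, here is a minimal, self-contained C++ sketch of greedy (argmax) decoding and top-k sampling over a logits vector. It illustrates the techniques in general, not the repository's actual implementation; `sample_greedy` and `sample_top_k` are made-up names.

```cpp
// Illustrative sketches of greedy (argmax) decoding and top-k sampling.
// Not the repository's actual API; names are invented for this example.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Greedy decoding: always pick the token with the highest logit.
std::size_t sample_greedy(const std::vector<float>& logits) {
    return static_cast<std::size_t>(
        std::max_element(logits.begin(), logits.end()) - logits.begin());
}

// Top-k sampling: keep the k largest logits, softmax them, draw one token.
std::size_t sample_top_k(const std::vector<float>& logits, std::size_t k,
                         std::mt19937& rng) {
    k = std::min(k, logits.size());

    // Indices of the k largest logits, in descending order of logit.
    std::vector<std::size_t> idx(logits.size());
    std::iota(idx.begin(), idx.end(), 0);
    std::partial_sort(idx.begin(), idx.begin() + static_cast<std::ptrdiff_t>(k), idx.end(),
                      [&](std::size_t a, std::size_t b) { return logits[a] > logits[b]; });
    idx.resize(k);

    // Exponentiate the kept logits (subtract the max for numerical stability).
    float max_logit = logits[idx[0]];
    std::vector<double> weights(k);
    for (std::size_t i = 0; i < k; ++i) {
        weights[i] = std::exp(static_cast<double>(logits[idx[i]] - max_logit));
    }

    // discrete_distribution normalizes the weights, giving the softmax probabilities.
    std::discrete_distribution<std::size_t> dist(weights.begin(), weights.end());
    return idx[dist(rng)];
}
```

Greedy decoding is deterministic and cheap; top-k trades that determinism for diversity by sampling from the k most likely tokens.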
Building requires:
- A C++ compiler with C++17 support
- CMake
```bash
git clone https://github.com/yourusername/transformer_cpp.git
cd transformer_cpp
mkdir build && cd build
cmake ..
make
```
Contributions, issues, and feature requests are welcome!