modded-nanogpt Public
Forked from KellerJordan/modded-nanogpt
NanoGPT (124M) in 5 minutes
ProX Public
Forked from GAIR-NLP/ProX
Official Repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale"
Python Apache License 2.0 Updated Oct 11, 2024
torchtitan Public
Forked from pytorch/torchtitan
A native PyTorch Library for large model training
Python BSD 3-Clause "New" or "Revised" License Updated Sep 13, 2024
mini-omni Public
Forked from gpt-omni/mini-omni
open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming audio output conversational capabilities.
Python MIT License Updated Sep 4, 2024
grouped_gemm Public
Forked from tgale96/grouped_gemm
PyTorch bindings for CUTLASS grouped GEMM.
Cuda Apache License 2.0 Updated Aug 26, 2024
WaveCoder Public
Forked from microsoft/WaveCoder
Advancing LLM with Diverse Coding Capabilities
Python MIT License Updated Aug 2, 2024
bigcode-evaluation-harness Public
Forked from bigcode-project/bigcode-evaluation-harness
A framework for the evaluation of autoregressive code generation language models.
Python Apache License 2.0 Updated Jul 15, 2024
efficient_cross_entropy Public
Forked from mgmalek/efficient_cross_entropy
Python MIT License Updated May 28, 2024
triton Public
Forked from triton-lang/triton
Development repository for the Triton language and compiler
C++ MIT License Updated May 24, 2024
cs224n Public
Solutions to CS224n: Natural Language Processing with Deep Learning assignments.
zero-bubble-pipeline-parallelism Public
Forked from sail-sg/zero-bubble-pipeline-parallelism
Zero Bubble Pipeline Parallelism
Python Other Updated Apr 29, 2024
transformers Public
Forked from huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Python Apache License 2.0 Updated Apr 11, 2024
scattermoe Public
Forked from shawntan/scattermoe
Triton-based implementation of Sparse Mixture of Experts.
Python Apache License 2.0 Updated Mar 14, 2024
ring-flash-attention Public
Forked from zhuzilin/ring-flash-attention
Ring attention implementation with flash attention
Python Updated Feb 27, 2024
english-wordnet Public
Forked from globalwordnet/english-wordnet
The Open English WordNet
Python Other Updated Dec 5, 2023
ml-engineering Public
Forked from stas00/ml-engineering
Machine Learning Engineering Guides and Tools
Python Creative Commons Attribution Share Alike 4.0 International Updated Nov 8, 2023
text-dedup Public
Forked from ChenghaoMou/text-dedup
All-in-one text de-duplication
Jupyter Notebook Apache License 2.0 Updated Nov 4, 2023
gpt-neox Public
Forked from EleutherAI/gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
Python Apache License 2.0 Updated Sep 5, 2023
Megatron-LLM Public
Forked from epfLLM/Megatron-LLM
distributed trainer for LLMs
Python Other Updated Sep 1, 2023
rerope Public
Forked from bojone/rerope
Rectified Rotary Position Embeddings
Python Updated Aug 7, 2023
flash-attention Public
Forked from Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Python BSD 3-Clause "New" or "Revised" License Updated Jul 29, 2023
contriever Public
Forked from CarperAI/contriever
Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning
Python Other Updated Jul 3, 2023
dynamic-sparse-flash-attention Public
Forked from epfml/dynamic-sparse-flash-attention
Jupyter Notebook Other Updated Jun 2, 2023
goodreads Public
Forked from MengtingWan/goodreads
code samples for the goodreads datasets
Jupyter Notebook Apache License 2.0 Updated May 29, 2023
DeepSpeed Public
Forked from deepspeedai/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Python Apache License 2.0 Updated May 27, 2023