Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Updated Apr 11, 2025 - Python
A high-throughput and memory-efficient inference and serving engine for LLMs
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Chinese LLaMA & Alpaca LLMs, with local CPU/GPU training and deployment
SGLang is a fast serving framework for large language models and vision language models.
Easy-to-use and powerful LLM and SLM library with an awesome model zoo.
Low-code framework for building custom LLMs, neural networks, and other AI models
Run any open-source LLM, such as DeepSeek and Llama, as an OpenAI-compatible API endpoint in the cloud.
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
BISHENG is an open LLM DevOps platform for next-generation enterprise AI applications. Powerful and comprehensive features include: GenAI workflow, RAG, Agent, unified model management, evaluation, SFT, dataset management, enterprise-level system management, observability, and more.
Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.