llama2 #459
Labels
AI-Chatbots
Topics related to advanced chatbot platforms integrating multiple AI models
base-model
LLM base models not fine-tuned for chat
llm-quantization
All about quantized LLM models and serving
openai
OpenAI APIs, LLMs, Recipes and Evals
Llama 2
The most popular model for general use.
265.8K Pulls
Updated 4 weeks ago
Overview
Llama 2 was released by Meta Platforms, Inc. The model is trained on 2 trillion tokens and supports a context length of 4,096 tokens by default. The Llama 2 Chat models are fine-tuned on over 1 million human annotations and are intended for chat use.
CLI
Open the terminal and run
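The command itself is missing from this capture; based on the `ollama run llama2` example shown under Model variants below, the CLI invocation is presumably:

```shell
# Pull (if needed) and run the default Llama 2 chat model,
# which is 4-bit quantized
ollama run llama2
```

This starts an interactive chat session against the locally served model.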
API
Example using curl:
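The curl example is missing from this capture; a minimal sketch, assuming the default Ollama server listening on `localhost:11434` and its `/api/generate` endpoint:

```shell
# Send a prompt to the locally served llama2 model;
# the server streams back JSON response chunks line by line
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?"
}'
```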
API documentation
Memory requirements
If you run into issues with higher quantization levels, try using the q4 model or shut down any other programs that are using a lot of memory.
Model variants
- **Chat**: fine-tuned for chat/dialogue use cases. These are the default in Ollama, and are tagged with `-chat` in the tags tab. Example: `ollama run llama2`
- **Pre-trained**: without the chat fine-tuning. These are tagged with `-text` in the tags tab. Example: `ollama run llama2:text`
By default, Ollama uses 4-bit quantization. To try other quantization levels, use the other tags. The number after the `q` is the number of bits used for quantization (e.g. `q4` means 4-bit quantization). The higher the number, the more accurate the model, but the slower it runs and the more memory it requires.

References
Suggested labels
{ "label-name": "llama2-model", "description": "A powerful text model for chat, dialogue, and general use.", "repo": "ollama.ai/library/llama2", "confidence": 91.74 }