prima.cpp: Speeding up 70B-scale LLM inference on low-resource everyday home clusters (C++, updated Mar 30, 2025)
Multi-agent workflows with Llama3: A private on-device multi-agent framework