Host and Run Entire LLM Models Directly in the Browser Locally
WebLLM is a modular and customizable JavaScript package that brings language model chat directly into web browsers with hardware acceleration. Everything runs inside the browser with no server support and is accelerated with WebGPU.
WebLLM is fully compatible with the OpenAI API, meaning you can use the same OpenAI-style API with open-source models running locally.
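
A minimal sketch of what that looks like in practice, assuming the `@mlc-ai/web-llm` package's `CreateMLCEngine` helper and one of its prebuilt model ids (the exact exports and model names vary between WebLLM versions, so check the WebLLM repo for the current API):

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Download (or load from cache) a model and initialize the WebGPU engine.
// The model id below is an assumption; pick any id from WebLLM's prebuilt model list.
const engine = await CreateMLCEngine("Llama-3-8B-Instruct-q4f16_1-MLC", {
  initProgressCallback: (progress) => console.log(progress.text),
});

// Chat with the model using the familiar OpenAI-style completions API.
const reply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Explain WebGPU in one sentence." }],
});

console.log(reply.choices[0].message.content);
```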
All of this is made possible by the great work done by the folks behind the WebLLM GitHub repo.
Compatible browsers (WebGPU is required; a feature-detection sketch follows this list):
- Microsoft Edge
- Google Chrome
- Chromium-based browsers in general (your mileage may vary)
- Firefox Nightly builds (roughly a coin flip whether it works)
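
Because support differs across browsers and machines, it can help to feature-detect WebGPU before trying to load a model. A minimal sketch (WebGPU typings come from the `@webgpu/types` package if you use TypeScript):

```ts
// Quick capability check before attempting to initialize WebLLM.
async function hasWebGPU(): Promise<boolean> {
  const gpu = (navigator as any).gpu;
  if (!gpu) return false;                  // browser does not expose WebGPU at all
  const adapter = await gpu.requestAdapter();
  return adapter !== null;                 // null means no usable GPU adapter was found
}

if (!(await hasWebGPU())) {
  console.warn("WebGPU is unavailable in this browser; WebLLM will not be able to run.");
}
```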
Things to keep in mind:
- Downloading and subsequently caching full models can take up a significant chunk of storage space, so be aware of that (see the storage-check sketch after this list).
- Performance is highly dependent on the specs of the machine loading and hosting the model.
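
A small sketch for checking how much browser storage is available before pulling down a multi-gigabyte model; the Storage API only reports rough estimates, but it gives a sense of whether a model will fit:

```ts
// Models are cached in browser storage, and a single quantized model
// can easily be several gigabytes, so check the available quota first.
const { usage = 0, quota = 0 } = await navigator.storage.estimate();
console.log(
  `Using ${(usage / 1e9).toFixed(2)} GB of roughly ${(quota / 1e9).toFixed(2)} GB available`
);
```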