
server : (webui) revamp Settings dialog, add Pyodide interpreter #11759

Merged on Feb 8, 2025 · 16 commits into ggml-org:master

Conversation

@ngxson (Collaborator) commented on Feb 8, 2025

In this PR:

  • revamp the Settings dialog into a two-column layout
  • add an "Experimentals" section, currently containing the "Python interpreter"
  • add an API for the side panel, aka "Canvas", which can be extended to support other types of canvas in the future; the UX should be like Claude's Canvas or ChatGPT's Artifacts (a rough shape is sketched below)
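For context, a rough sketch of what such a canvas API could look like. The type and field names here are illustrative, not necessarily the actual interface in this PR:

```ts
// Hypothetical shape of the side-panel ("Canvas") state; names are
// illustrative. The union can grow new canvas types later (HTML preview,
// charts, ...) without touching the chat view itself.
type CanvasData = {
  type: 'PY_INTERPRETER'; // currently the only canvas type
  content: string;        // the Python code shown/run in the panel
};

interface CanvasContext {
  canvasData: CanvasData | null;                    // the open canvas, if any
  setCanvasData: (data: CanvasData | null) => void; // open or close the panel
}
```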

The Python interpreter uses Pyodide under the hood, which is CPython compiled to WebAssembly. Due to the large bundle size, this feature requires an internet connection to download the Pyodide JS and WASM files from a CDN.

[screenshot]
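For reference, loading Pyodide from the CDN boils down to something like this. A minimal sketch; the pinned version in the URL is illustrative, not necessarily what the webui uses:

```ts
// Minimal sketch: load Pyodide from the jsDelivr CDN and run a snippet.
import { loadPyodide } from 'https://cdn.jsdelivr.net/pyodide/v0.27.2/full/pyodide.mjs';

async function runPython(code: string): Promise<string> {
  // The first call downloads pyodide.asm.wasm (~11 MB), hence the
  // internet-connection requirement mentioned above.
  const pyodide = await loadPyodide();
  const result = await pyodide.runPythonAsync(code);
  return String(result); // value of the last expression, stringified
}
```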

Test with a real-world problem:

[screenshots]

@ggerganov (Member)

Very cool stuff.

It doesn't seem to handle errors atm:

[screenshot]

@ngxson (Collaborator, Author) commented on Feb 8, 2025

OK, thanks for testing; the exception should be handled now!

[screenshot]
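The idea is simply to catch the error Pyodide throws and render its message (which contains the Python traceback) as the interpreter output. A sketch, not the exact code in this PR:

```ts
// Sketch: surface Python exceptions as output instead of breaking the UI.
// `pyodide` is an already-loaded Pyodide instance (see the snippet above).
async function runPythonSafe(pyodide: any, code: string): Promise<string> {
  try {
    const result = await pyodide.runPythonAsync(code);
    return String(result);
  } catch (err) {
    // Pyodide wraps Python exceptions in a JS Error whose message holds
    // the full Python traceback, so it can be shown verbatim to the user.
    return err instanceof Error ? err.message : String(err);
  }
}
```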

@ggerganov (Member)

How did you get it to generate the "Mortgage Calculator" title? Qwen Coder does not seem to do it.

@ngxson (Collaborator, Author) commented on Feb 8, 2025

> How did you get it to generate the "Mortgage Calculator" title? Qwen Coder does not seem to do it.

I'm using Llama 3.1 8B, but it's quite random; it never does that again when I regenerate the message.

@ggerganov (Member)

Maybe there should be a way to stop execution of long programs:

[screenshot]

@ngxson (Collaborator, Author) commented on Feb 8, 2025

> Maybe there should be a way to stop execution of long programs:

Yeah, right. It also blocks the main thread, so the UX is not very good. Unfortunately it's quite complicated to fix, because it requires implementing a web worker: https://pyodide.org/en/stable/usage/webworker.html

I'll see if I can fix it in this PR; otherwise, I think we can still release the first version without it and add it later.
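The shape of the fix, following the Pyodide web-worker docs: run the interpreter in a worker, and implement "stop" by terminating the worker. A rough sketch; the file name and message shape are illustrative:

```ts
// pyodide.worker.ts (illustrative name) -- runs Python off the main thread.
importScripts('https://cdn.jsdelivr.net/pyodide/v0.27.2/full/pyodide.js');
const pyodideReady = (self as any).loadPyodide();

self.onmessage = async (ev: MessageEvent<{ code: string }>) => {
  const pyodide = await pyodideReady;
  try {
    const result = await pyodide.runPythonAsync(ev.data.code);
    (self as any).postMessage({ ok: true, result: String(result) });
  } catch (err) {
    (self as any).postMessage({ ok: false, error: String(err) });
  }
};

// Main thread: a "stop" button just terminates the worker mid-execution
// and spawns a fresh one for the next run.
let worker = new Worker('pyodide.worker.js');
function stopExecution() {
  worker.terminate();
  worker = new Worker('pyodide.worker.js');
}
```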

@woof-dog (Contributor) commented on Feb 8, 2025

Is there a way we can have llama-server serve these dependencies so that an internet connection is not required? Like, some folder we can dump them into.

Maybe this is a future issue, but I would prefer to serve these directly.

@ngxson (Collaborator, Author) commented on Feb 8, 2025

> Is there a way we can have llama-server serve these dependencies so that an internet connection is not required?

We can, and it's trivial to do. But I'm not comfortable shipping a binary containing 8 MB of llama.cpp, 1 MB of webui, and 11 MB of Pyodide; it's literally bloatware to the many users who only use llama-server via the API.

@ngxson added the "merge ready" label (indicates that this may be ready to merge soon and is just holding out in case of objections) on Feb 8, 2025
@ngxson merged commit 55ac8c7 into ggml-org:master on Feb 8, 2025 · 46 checks passed
@MoonRide303 (Contributor) commented on Feb 9, 2025

> Is there a way we can have llama-server serve these dependencies so that an internet connection is not required?
>
> We can, and it's trivial to do. But I'm not comfortable shipping a binary containing 8 MB of llama.cpp, 1 MB of webui, and 11 MB of Pyodide; it's literally bloatware to the many users who only use llama-server via the API.

Both points sound valid: local servers should be fully usable without connecting to the Internet, but keeping the core as small as possible is important, too. If it's a simple thing, then maybe you could just add some brief instructions on how to run it this way?

@ngxson (Collaborator, Author) commented on Feb 9, 2025

@MoonRide303 it being trivial to implement doesn't mean the end user can easily do it. What I mean by trivial is that we as developers can download the vendor library to the /public directory and compile llama-server with it.

A better way would be to do this entirely at the web level, not at the C++ level.

We can download the vendor library (in this case, the Pyodide WASM file) into the browser's cache and store it there. The user only needs to download it once; the next time, we load it from the browser's cache.

I'd appreciate it if someone could play with this.
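A sketch of that idea using the browser Cache API; the URL, version, and cache name are illustrative:

```ts
// Sketch: persist the Pyodide assets in the browser's Cache storage so the
// CDN is only hit on the very first load.
const PYODIDE_BASE = 'https://cdn.jsdelivr.net/pyodide/v0.27.2/full/';

async function fetchCached(url: string): Promise<Response> {
  const cache = await caches.open('pyodide-assets');
  const hit = await cache.match(url);
  if (hit) return hit;                           // offline after first download
  const res = await fetch(url);                  // first time: go to the CDN...
  if (res.ok) await cache.put(url, res.clone()); // ...and store a copy
  return res;
}

// e.g. fetchCached(PYODIDE_BASE + 'pyodide.asm.wasm');
```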

@woof-dog (Contributor)

I agree with @MoonRide303, and that was my original intention. Average users can download it from the Pyodide website, store it in their browser's cache, whatever. But power users who want 100% control over the longevity, accessibility, and privacy of their service, and who want to self-host their llama.cpp with no outside dependencies (after setup) so it can run forever, should have some "somewhat easy" way to do that without having to modify code themselves.

For example, maybe we could hint at which URL to load Pyodide from via some server endpoint, where an absent value means "use the default", but a user who wants to serve Pyodide locally can change it.
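Something like this on the webui side. The `pyodide_base_url` field is hypothetical, just to illustrate the idea (llama-server does expose a /props endpoint, but no such field exists today):

```ts
// Sketch: let the server optionally advertise where to load Pyodide from.
const DEFAULT_PYODIDE_BASE = 'https://cdn.jsdelivr.net/pyodide/v0.27.2/full/';

async function getPyodideBase(): Promise<string> {
  const props = await fetch('/props').then((r) => r.json());
  // An absent value means "use the default CDN"; self-hosters could point it
  // at files served locally by llama-server or any static file server.
  return props.pyodide_base_url ?? DEFAULT_PYODIDE_BASE;
}
```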

tinglou pushed a commit to tinglou/llama.cpp that referenced this pull request on Feb 13, 2025:
server : (webui) revamp Settings dialog, add Pyodide interpreter (ggml-org#11759)

* redo Settings modal UI

* add python code interpreter

* fix auto scroll

* build

* fix overflow for long output lines

* bring back sticky copy button

* adapt layout on mobile view

* fix multiple lines output and color scheme

* handle python exception

* better state management

* add webworker

* add headers

* format code

* speed up by loading pyodide on page load

* (small tweak) add small animation to make it feels like claude
orca-zhang pushed a commit to orca-zhang/llama.cpp that referenced this pull request on Feb 26, 2025
arthw pushed a commit to arthw/llama.cpp that referenced this pull request on Feb 26, 2025
Labels: examples, merge ready, server