Add cancel() method to interrupt a stream #733
Conversation
Please accept this PR @abetlen.
Actually, I found an issue with this method: it only cancels after a token has been generated, so if the LLM is slow or gets stuck processing the prompt, it doesn't cancel at all. We need a better method.
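A minimal sketch of that limitation, assuming a flag-based `cancel()` along the lines of this PR (the class and method names below are illustrative, not the actual patch): the flag is polled once per generated token, so nothing can interrupt the blocking prompt-evaluation step.

```python
import time

class StreamingSession:
    """Hypothetical sketch of flag-based cancellation (not this PR's actual code)."""

    def __init__(self):
        self._cancelled = False

    def cancel(self):
        # Sets the flag; it is only honoured at the next check in generate().
        self._cancelled = True

    def generate(self, prompt, max_tokens=8):
        # Stands in for prompt evaluation: one blocking call that never
        # checks the flag, which is exactly the gap described above.
        time.sleep(1.0)
        for i in range(max_tokens):
            if self._cancelled:  # polled once per token
                return
            time.sleep(0.1)      # stands in for per-token decoding
            yield f"token-{i}"
```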
I'm coming back to this because I need to figure out a better way to interrupt generation programmatically. For a console-based scenario it's easy in Python: I just wrap the code in try/except KeyboardInterrupt, and then I can press Ctrl+C at any point to gracefully interrupt the LLM. But with a front-end user interface, I haven't managed to make it work properly, say with a "Stop generating" button that calls a Python function, because of the issue I mentioned in the previous post. @abetlen sorry to bother you again, but do you have any suggestions or ideas on how to accomplish this?
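A sketch of both approaches described above, assuming llama-cpp-python's streaming completion API (the model path and the `on_token` callback are placeholders):

```python
import threading
from llama_cpp import Llama

llm = Llama(model_path="model.gguf")  # placeholder path

def run_console(prompt: str) -> None:
    # Console case: Ctrl+C raises KeyboardInterrupt between tokens.
    try:
        for chunk in llm(prompt, stream=True):
            print(chunk["choices"][0]["text"], end="", flush=True)
    except KeyboardInterrupt:
        print("\n[generation interrupted]")

stop_event = threading.Event()  # a "Stop generating" button would call stop_event.set()

def run_ui(prompt: str, on_token) -> None:
    # UI case: poll the event between tokens. The same limitation applies:
    # the check cannot run while the library is still evaluating the prompt.
    for chunk in llm(prompt, stream=True):
        if stop_event.is_set():
            break
        on_token(chunk["choices"][0]["text"])
```

Polling an event between tokens is the thread-safe equivalent of the Ctrl+C approach, but neither can break out of the blocking prompt-evaluation phase, which is the gap raised earlier in this thread.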
Force-pushed from 8c93cf8 to cc0fe43.
Why not add it now and improve it later if a better solution appears? For now, this would work in most cases.
Has anyone found a reasonable solution for this? Or am I the only one who doesn't want to wait until the model finishes, short of killing the job and losing context?
Any chance this gets merged for now?
It indeed blocks until the first token is produced, but cancelling it after that point is trivial. A related issue is cancelling a model that is still loading.
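A sketch of why post-first-token cancellation is trivial, reusing the `llm` object from the earlier sketch: since the stream is a Python generator, cancelling just means abandoning it, and `close()` raises GeneratorExit inside the generator so no further tokens are decoded.

```python
stream = llm("Write a long story.", stream=True)  # placeholder prompt
first = next(stream)  # blocks until the first token is produced
print(first["choices"][0]["text"])
stream.close()        # stop consuming; generation goes no further
```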
The gpt4all Python bindings offer a similar mechanism that allows stopping at the next token.
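For reference, a hedged sketch of that gpt4all-style mechanism: a per-token callback whose boolean return value decides whether generation continues. The callback signature and the `callback` parameter are assumptions based on the gpt4all bindings and should be verified against the installed version.

```python
from gpt4all import GPT4All

model = GPT4All("model.gguf")  # placeholder model name
stop_requested = False         # e.g. set to True by a UI thread

def on_token(token_id: int, response: str) -> bool:
    # Returning False asks the bindings to stop before the next token
    # (assumed behaviour; see the lead-in above).
    return not stop_requested

model.generate("Hello, world", callback=on_token)
```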
+1, can we merge this?
Take a look at ggerganov/llama.cpp#10509, which should permanently solve this problem on llama.cpp's side.
Fixes #599.
Thanks for all your work on this project!