Is your feature request related to a problem? Please describe.
While generating tokens with `stream=True`, I would like to stop generation when some condition that changes at runtime is met.
For example, I would like to stop after 5 lines of generation.
Describe the solution you'd like
I would like a method on the `llm` object, called `stop()` or `interrupt()`, that forces the model to stop after the next token is generated, similar to pressing Ctrl+C in the regular llama.cpp CLI.
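For illustration, usage might look something like the sketch below. This is purely hypothetical: neither `stop()` nor `interrupt()` exists in llama-cpp-python today, and the model path is a placeholder.

```python
# Hypothetical sketch of the requested API -- interrupt() does NOT exist
# in llama-cpp-python today; this only illustrates the desired behavior.
import threading
from llama_cpp import Llama

llm = Llama(model_path="./model.gguf")  # placeholder model path

def consume():
    # Stream tokens until the model finishes or is interrupted.
    for chunk in llm("Write a story:", max_tokens=512, stream=True):
        print(chunk["choices"][0]["text"], end="", flush=True)

worker = threading.Thread(target=consume)
worker.start()

# Later, when some runtime condition changes, halt generation from
# outside the streaming loop -- similar to Ctrl+C in the llama.cpp CLI:
llm.interrupt()  # requested feature, not a real method
worker.join()
```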
Describe alternatives you've considered
I have considered adding a newline as a stop token and re-invoking generation line by line, but I think this is not performant. Another way I can think of is mutating the `stop` list after passing it to the generation method, but that feels hacky.
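A third option that works today, sketched below under the assumption that `stream=True` returns a lazy generator (so no further tokens are sampled once the loop exits), is to break out of the consuming loop when the condition is met. Unlike the requested `interrupt()`, though, this only helps when the condition can be checked inside the loop that consumes the stream; the model path and prompt here are placeholders.

```python
# A minimal sketch of the loop-break workaround, assuming
# llama-cpp-python's streaming API.
from llama_cpp import Llama

llm = Llama(model_path="./model.gguf")  # placeholder model path

newlines = 0
for chunk in llm("Write a poem:", max_tokens=512, stream=True):
    text = chunk["choices"][0]["text"]
    print(text, end="", flush=True)
    newlines += text.count("\n")
    if newlines >= 5:
        # Exiting the loop stops the underlying generator, so no
        # further tokens are generated after this point.
        break
```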