Native api hangs when providing system_prompt #3766
I tried using this instead and it seems to work:

```json
{
    "prompt": "You are an angry assistant named Bob that swears alot\nUser: What is your name ?",
    "anti_prompt": "User:",
    "assistant_name": "Assistant:",
    "temperature": 0.7
}
```

According to https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md:

```json
{
    "system_prompt": {
        "prompt": "Transcript of a never ending dialog, where the User interacts with an Assistant.\nThe Assistant is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision.\nUser: Recommend.........g \"Surely You're Joking, Mr. Feynman!\" and \"What Do You Care What Other People Think?\".\nUser:",
        "anti_prompt": "User:",
        "assistant_name": "Assistant:"
    }
}
```

If I try to load the system prompt via the `-spf` argument, it does generate some errors.

prompt.json:

```json
{
    "system_prompt": {
        "prompt": "Transcript of a never ending dialog, where the User interacts with an Assistant.\nThe Assistant is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision.\nUser: Recommend a nice restaurant in the area.\nAssistant: I recommend the restaurant \"The Golden Duck\". It is a 5 star restaurant with a great view of the city. The food is delicious and the service is excellent. The prices are reasonable and the portions are generous. The restaurant is located at 123 Main Street, New York, NY 10001. The phone number is (212) 555-1234. The hours are Monday through Friday from 11:00 am to 10:00 pm. The restaurant is closed on Saturdays and Sundays.\nUser: Who is Richard Feynman?\nAssistant: Richard Feynman was an American physicist who is best known for his work in quantum mechanics and particle physics. He was awarded the Nobel Prize in Physics in 1965 for his contributions to the development of quantum electrodynamics. He was a popular lecturer and author, and he wrote several books, including \"Surely You're Joking, Mr. Feynman!\" and \"What Do You Care What Other People Think?\".\nUser:",
        "anti_prompt": "User:",
        "assistant_name": "Assistant:"
    }
}
```

Error:

```
llama server listening at http://0.0.0.0:8080
{"timestamp":1698173160,"level":"INFO","function":"main","line":2499,"message":"HTTP server listening","hostname":"0.0.0.0","port":8080}
```
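The workaround above (inlining the persona into `prompt` and passing `anti_prompt` and `assistant_name` at the top level instead of inside a `system_prompt` object) can be sketched in Python. This is a minimal sketch, not part of the original report: the helper names are my own, it assumes a llama.cpp server on `localhost:8080`, and it adds a request timeout so a hung request raises instead of blocking forever.

```python
import json
import urllib.request

def build_flat_payload(persona: str, user_message: str) -> dict:
    """Build a /completion request body that avoids the hanging
    system_prompt object: the persona is prepended to the prompt,
    and anti_prompt/assistant_name sit at the top level."""
    return {
        "prompt": f"{persona}\nUser: {user_message}",
        "anti_prompt": "User:",
        "assistant_name": "Assistant:",
        "temperature": 0.7,
    }

def complete(payload: dict, url: str = "http://localhost:8080/completion") -> dict:
    """POST the payload to the server; the timeout makes a hang
    fail fast rather than block until a server restart."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())

payload = build_flat_payload(
    "You are an angry assistant named Bob that swears a lot.",
    "What is your name ?",
)
print(payload["prompt"])
```

Calling `complete(payload)` against a running server should return the generated completion; only the payload construction is shown running here.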
I'm working on a fix in #3767. So far it does not block on system prompt update, but somehow the anti-prompt does not seem to work.
Thanks for the update
Branch:

```json
{
    "prompt": "User: Tell me about yourself ?",
    "system_prompt": {
        "anti_prompt": "User:",
        "assistant_name": "Assistant:",
        "prompt": "You are an angry assistant that swears alot, your name is Bob\n"
    },
    "temperature": 0.1
}
```
Hi,

I am playing around with the native API and it works well when just using the basic example:

```shell
curl --request POST \
  --url http://localhost:8080/completion \
  --header "Content-Type: application/json" \
  --data '{"prompt": "Building a website can be done in 10 simple steps:","n_predict": 128}'
```
However, if I add the `system_prompt` parameter to the query, it hangs indefinitely: nothing is printed on the server side and no load is seen using nvtop.

Server command:

```shell
llama-server --host 0.0.0.0 -m /models/mistral-7b-instruct-v0.1.Q5_K_M.gguf -c 8000 -ngl 100
```

Server output:

```
llama server listening at http://0.0.0.0:8080
{"timestamp":1698170944,"level":"INFO","function":"main","line":2499,"message":"HTTP server listening","hostname":"0.0.0.0","port":8080}
all slots are idle and system prompt is empty, clear the KV cache
```

Adding -v to llama-server makes no difference in the output.
This query hangs when system_prompt is used:

```shell
curl --request POST \
  --url http://127.0.0.1:8080/completion \
  --header "Content-Type: application/json" \
  --data '{
    "prompt": "User: What is your name ?\nAssistant:",
    "system_prompt": {
      "anti_prompt": "User:",
      "assistant_name": "Assistant:",
      "prompt": "You are an angry assistant that swears alot and your name is Bob\n"
    },
    "temperature": 0.8
  }'
```
Any ideas what I am missing here? What I am trying to achieve is to give the model some context.

What's even more strange is that after trying the above query, simple queries no longer work either; they hang in the same way until the server is restarted.