server : fix templates for llama2, llama3 and zephyr in new UI #8196
Conversation
Just noticed that there is plenty of discussion about the future direction of templating in the server in #4216.
I'm not familiar with the template system in JS, but it seems like there are still some issues. For example, the Llama 3 template never has newlines before the EOT token (
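The point about newlines can be sketched as follows. This is a hedged illustration of the Llama 3 chat format being discussed, under the assumption that the `<|eot_id|>` token directly follows the message content; the helper name is hypothetical.

```python
# Hypothetical sketch (not llama.cpp code): in the assumed Llama 3 format,
# <|eot_id|> comes directly after the message content, with no newline
# in between. The header is followed by a blank line before the content.

def format_llama3_turn(role: str, content: str) -> str:
    # No "\n" is emitted before <|eot_id|> -- the issue the comment raises.
    return f"<|start_header_id|>{role}<|end_header_id|>\n\n{content}<|eot_id|>"

prompt = (
    format_llama3_turn("user", "Hello")
    + format_llama3_turn("assistant", "Hi there")
)
```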
Yes, it would be nice to do so. Currently there is no task to add such a feature. Probably we can extend
@ngxson Thanks - I have removed the extra newlines in llama3. As for the llama2 template, the Making the chat example available through the
I am closing this PR to be able to update my repo without causing too much "noise" (as I accidentally made it from "master"), and to wait for the outcome of the investigation for #8694.
closing for now
This change makes some adjustments to the pre-defined chat templates in the new server UI, which in my interpretation bring them in line with the recommended versions. I have done this for the following templates:

- llama2
- llama3
- zephyr

as these are the models I have some experience with. There might be further similar discrepancies in other templates, but I have not checked those. For the Llama models, I have also removed the start-of-text tokens at the beginning, as they are automatically added by the server, and their duplication leads to a warning message.
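The start-of-text point can be illustrated with a small sketch. This is an assumption-laden example, not server code: the token string and helper name are illustrative, and the only claim taken from the PR is that the server adds the token itself, so the UI template should not repeat it.

```python
# Hedged sketch: if the server prepends the begin-of-sequence token itself,
# a UI template that also starts with it produces a duplicate (and, per the
# PR, a warning). Token string "<s>" and this helper are illustrative only.

def strip_leading_bos(template: str, bos: str = "<s>") -> str:
    # Drop a duplicated BOS at the start of a UI template string, if present.
    return template[len(bos):] if template.startswith(bos) else template

llama2_ui_template = "<s>[INST] {prompt} [/INST]"
fixed = strip_leading_bos(llama2_ui_template)
```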
It would of course be nicer to connect this to the `llama_chat_apply_template()` implementation, so that there is only one set of templates to maintain and test in the codebase, for example by making the server UI use the chat endpoint rather than the completion one. Is anyone already working on this?
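The suggested direction could look roughly like this. A minimal sketch, assuming the UI would switch to the server's OpenAI-compatible chat endpoint, where the client sends structured messages and the server applies the model's template via `llama_chat_apply_template()`; the endpoint path and helper are assumptions about shape, not the final design.

```python
# Hypothetical sketch of the proposed direction: the UI sends structured
# messages to the chat endpoint (assumed to be /v1/chat/completions) and
# keeps no template strings of its own -- templating stays server-side.

CHAT_ENDPOINT = "/v1/chat/completions"  # assumed path

def build_chat_request(messages: list, stream: bool = True) -> dict:
    # The client no longer renders a prompt string; it only ships the
    # role/content message list plus options.
    return {"messages": messages, "stream": stream}

payload = build_chat_request([{"role": "user", "content": "Hello"}])
```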