
/props endpoint: provide context size through default_generation_settings #1237

Merged

Conversation

kallewoof

It seems like a good idea to be able to pull the context size directly from the backend, rather than relying on users to manually keep both sides in sync.

This adds the default_generation_settings dictionary with a single (for now) n_ctx entry to the /props endpoint response, which allows clients to update their context size to match the max context size used by koboldcpp.

It may be useful to include other default generation settings as well, but I am keeping it minimal for now.
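
For illustration, a minimal client-side sketch of consuming the new field; the host/port and the exact response shape are assumptions based on this PR's description and llama.cpp's /props format:

```python
import requests

# Assumption: koboldcpp serving on its default port, 5001.
props = requests.get("http://localhost:5001/props").json()

# With this PR, /props carries a default_generation_settings dict
# alongside the existing chat_template and total_slots fields.
n_ctx = props["default_generation_settings"]["n_ctx"]
print(f"backend max context size: {n_ctx}")
```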

@LostRuins (Owner) commented Nov 25, 2024

Does this adhere to the format for the /props endpoint set by llama.cpp?

Because otherwise you can already use the koboldcpp endpoints /api/extra/true_max_context_length and /api/latest/config/max_context_length.

If nobody else is using it, I would prefer not to create redundant endpoints.
https://lite.koboldai.net/koboldcpp_api#/api%2Fv1/get_api_v1_config_max_context_length
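
For reference, a sketch against those pre-existing endpoints; the `{"value": N}` response shape and the default port are assumptions taken from the linked API documentation:

```python
import requests

base = "http://localhost:5001"  # assumed default koboldcpp port

# Both endpoints are assumed to return {"value": <int>} per the
# linked koboldcpp API documentation.
true_max = requests.get(f"{base}/api/extra/true_max_context_length").json()
cfg_max = requests.get(f"{base}/api/latest/config/max_context_length").json()
print(true_max["value"], cfg_max["value"])
```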

@kallewoof (Author) commented Nov 25, 2024

> Does this adhere to the format for the /props endpoint set by llama.cpp?

Yes. Adding this field to the pre-existing /props endpoint makes the two compatible in terms of fetching the backend's max context size.

> Because otherwise you can already use the koboldcpp endpoints /api/extra/true_max_context_length and /api/latest/config/max_context_length

Didn't realize that.

> If nobody else is using it, I would prefer not to create redundant endpoints. https://lite.koboldai.net/koboldcpp_api#/api%2Fv1/get_api_v1_config_max_context_length

  1. This actually extends an existing endpoint (which currently returns chat_template and total_slots). In a way it adds the missing third field that llama.cpp provides, default_generation_settings, but limits its contents (for now) to the context size.
  2. Using the pre-existing koboldcpp endpoints would achieve the same goal, so I am neutral about merging this.
  3. The benefit of merging is that it makes the /props endpoint complete in terms of "root" fields and makes fetching the max context size completely transparent for clients (if we disregard version handling); see the sketch below.
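
A hypothetical client sketch of that transparency: prefer the llama.cpp-compatible /props field and fall back to the koboldcpp-specific endpoint on builds without this PR. fetch_max_context is an illustrative helper, not part of either API, and the `{"value": N}` fallback shape is an assumption from the linked docs:

```python
import requests

def fetch_max_context(base_url: str) -> int:
    # Preferred path: the field this PR adds, matching llama.cpp's /props.
    props = requests.get(f"{base_url}/props").json()
    settings = props.get("default_generation_settings")
    if settings and "n_ctx" in settings:
        return settings["n_ctx"]
    # Fallback: koboldcpp-specific endpoint (assumed {"value": N} shape).
    resp = requests.get(f"{base_url}/api/extra/true_max_context_length").json()
    return resp["value"]

print(fetch_max_context("http://localhost:5001"))  # assumed default port
```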

@LostRuins (Owner) left a comment

alright then

@LostRuins merged commit fd320f6 into LostRuins:concedo_experimental on Nov 26, 2024
@kallewoof deleted the 202411-props-gensettings branch on Nov 26, 2024