Hi there,

Love this server: it's super fast and really one of the few that utilises my GPUs to their full capacity.
The one problem I am having is that I rely on the grammar support in llama-cpp-python to constrain the LLM's output to a JSON format I can parse.
I have tried without it, but the formatting is so poor that the remedial work required makes any time saved by the faster server a wash, bearing in mind I am dealing with thousands of requests, not just one or two.
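For context, this is the kind of workflow I mean. In llama-cpp-python you can load a GBNF grammar (via `LlamaGrammar.from_string` or `from_file`) and pass it as the `grammar` parameter on generation, and the sampler is then forced to emit only strings the grammar accepts. A minimal sketch of such a grammar, constraining output to a flat JSON object of string fields (field names and structure here are just illustrative, not my actual grammar):

```gbnf
# Illustrative GBNF: a JSON object with string keys and string values only.
root   ::= "{" ws pair (ws "," ws pair)* ws "}"
pair   ::= string ws ":" ws string
string ::= "\"" [^"\\]* "\""
ws     ::= [ \t\n]*
```

Every completion sampled under a grammar like this parses with a stock JSON parser, which is what makes the downstream processing cheap.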
It would be great if we could use grammars with vLLM and get back the responses we need.
Appreciate the consideration.