[RFC]: Implement Structured Output support for V1 engine #11908

Open
1 task done
russellb opened this issue Jan 9, 2025 · 1 comment

russellb commented Jan 9, 2025

Motivation.

Structured Output is supported in V0 but not yet in V1. One reason for the delay is that the V0 integration has had performance problems, and we'd like to rethink the integration approach. We would also like the new design to leave room for additional techniques in the future, jump decoding in particular.

The document below covers the proposed integration of the Structured Output functionality in V1 of the vLLM engine.

Proposed Change.

A draft proposal can be found in this Google Doc: https://docs.google.com/document/d/1H6m_Y3FLJ1FYGCmjXdZzoJv-JCDSxnKuSY2XiAj-c6c/edit?tab=t.0

This content will eventually be moved into a PR as an addition to the design docs section of the vLLM docs.

Related issue for closing xgrammar feature gaps: #12131

Feedback Period.

No response

CC List.

@mgoin @aarnphm @markmc @simon-mo @xuechendi @WoosukKwon

Any Other Things.

No response

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

aarnphm commented Jan 26, 2025

wip #12388 for initial support in v1
