refactor(js/ai): refactored constrained generation into middleware, simplified json format #1612
I think I'm generally good with this, and I'm wondering if I made a mistake in how I was thinking about formats around streaming. I'm wondering if instead of `format: 'array'` it should have been `{format: 'json', stream: 'incremental' | 'array'}`.
I'm also wondering if `format` should accept mime types (e.g. `application/json`), with the existing `json` and `text` being just shorthands that get translated into `application/json` and `text/plain`.
It would be a breaking change, so if you think that sounds better we should try to work it in before 1.0.
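To make the shorthand idea concrete, here is a minimal sketch of how such a translation could look. The names `FORMAT_SHORTHANDS` and `resolveFormat` are hypothetical and exist only for illustration; nothing here is part of the library's API.

```typescript
// Hypothetical shorthand table: maps the existing short format names
// to the mime types suggested in the comment above.
const FORMAT_SHORTHANDS = new Map<string, string>([
  ['json', 'application/json'],
  ['text', 'text/plain'],
]);

// Returns the mime type for a known shorthand, or the input unchanged
// when it is not a shorthand (for example, already a mime type).
function resolveFormat(format: string): string {
  return FORMAT_SHORTHANDS.get(format) ?? format;
}
```

Under this assumption, existing callers passing `'json'` or `'text'` would keep working, while new callers could pass a mime type directly.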
Honestly, I like the simplicity of the current formats. They kind of just make sense and are simple to use. It's not obvious what benefits splitting out stream config would offer.
@pavelgj @mbleigh Really appreciate your help. I think this might also affect others using dotprompt, as it might seem like Gemini just doesn't follow instructions very well.
The requests with and without the middleware look exactly the same. However, without the middleware, the model doesn't consistently follow prompt instructions.
related to #1360
Constrained generation is not coupled with the `json` format. The JSON format declares that it depends on constrained generation. Models that natively support constrained generation will handle it automatically, and for models that don't we provide the `simulateConstrainedGeneration` middleware (automatically injected when the model does not declare support for constrained generation).
If a developer wants to use simulated constrained generation explicitly, they can specify the middleware on the generate call. The instructions for the middleware are overridable.
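
As a rough illustration of the mechanism described above (not the actual implementation in this PR), simulated constrained generation can be thought of as a middleware that turns the output schema into plain-text instructions and then clears the constraint flag before calling the model. Everything in this sketch is an assumption made for illustration: the `ModelRequest` and `ModelResponse` types, the `injectInstructions` helper, the `simulateConstrainedSketch` factory, and the `instructionsRenderer` option name.

```typescript
// Hypothetical, simplified types; these are NOT the library's real interfaces.
interface ModelRequest {
  messages: { role: 'system' | 'user' | 'model'; text: string }[];
  output?: { schema?: object; constrained?: boolean };
}
interface ModelResponse {
  text: string;
}
type Next = (req: ModelRequest) => Promise<ModelResponse>;
type Middleware = (req: ModelRequest, next: Next) => Promise<ModelResponse>;
type InstructionsRenderer = (schema: object) => string;

// Default wording of the injected instructions (an assumption).
const defaultInstructions: InstructionsRenderer = (schema) =>
  'Output should be in JSON format and conform to the following schema:\n' +
  JSON.stringify(schema);

// Pure helper: if the request asks for constrained output with a schema,
// append rendered instructions and mark the request as unconstrained.
function injectInstructions(
  req: ModelRequest,
  render: InstructionsRenderer
): ModelRequest {
  if (!req.output?.constrained || !req.output.schema) {
    return req; // nothing to simulate
  }
  return {
    ...req,
    messages: [
      ...req.messages,
      { role: 'user', text: render(req.output.schema) },
    ],
    output: { ...req.output, constrained: false },
  };
}

// Middleware factory; the instructions are overridable through the option.
function simulateConstrainedSketch(
  opts: { instructionsRenderer?: InstructionsRenderer } = {}
): Middleware {
  const render = opts.instructionsRenderer ?? defaultInstructions;
  return (req, next) => next(injectInstructions(req, render));
}
```

A caller wanting different wording would pass its own renderer, for example `simulateConstrainedSketch({ instructionsRenderer: (s) => 'Reply with JSON matching: ' + JSON.stringify(s) })`.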
Checklist (if applicable):