refactor(js/ai): refactored constrained generation into middleware, simplified json format #1612

Merged Jan 27, 2025 (10 commits)

@pavelgj pavelgj commented Jan 15, 2025

related to #1360

Constrained generation is not coupled to the JSON format; the JSON format only declares that it depends on constrained generation. Models that natively support constrained generation handle it automatically. For models that don't, we provide the simulateConstrainedGeneration middleware, which is injected automatically when the model does not declare support for constrained generation.
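
To make the mechanism concrete, here is a minimal, self-contained sketch of how such a middleware and its automatic injection could work. Every type and name in it (`GenerateRequest`, `withSimulatedConstraint`, `resolveMiddleware`, the instruction wording) is a simplified assumption for illustration, not Genkit's actual implementation.

```typescript
// Hypothetical, simplified shapes for illustration; NOT Genkit's real types.
interface Part { text: string }
interface Message { role: 'user' | 'model' | 'system'; content: Part[] }
interface OutputConfig {
  format?: string;
  constrained?: boolean;
  schema?: Record<string, unknown>;
}
interface GenerateRequest { messages: Message[]; output?: OutputConfig }

type Next = (req: GenerateRequest) => Promise<string>;
type Middleware = (req: GenerateRequest, next: Next) => Promise<string>;

// Pure helper: describe the schema in the prompt and stop asking the model
// to constrain its output natively. Returns a new request; the input is not mutated.
function withSimulatedConstraint(req: GenerateRequest): GenerateRequest {
  const schema = req.output?.schema;
  if (!req.output?.constrained || !schema) return req; // nothing to simulate
  const instructions =
    'Output should be JSON conforming to this schema:\n' + JSON.stringify(schema);
  return {
    ...req,
    messages: [...req.messages, { role: 'user', content: [{ text: instructions }] }],
    output: { ...req.output, constrained: false },
  };
}

// The middleware itself only rewrites the request and passes it on.
const simulateConstrained: Middleware = (req, next) =>
  next(withSimulatedConstraint(req));

// Assumed injection rule: add the simulation only when the model does not
// declare native support for constrained generation.
function resolveMiddleware(
  modelSupportsConstrained: boolean,
  userMiddleware: Middleware[],
): Middleware[] {
  return modelSupportsConstrained
    ? userMiddleware
    : [...userMiddleware, simulateConstrained];
}
```

In this sketch the rewrite is kept in a pure function so it can be inspected or tested without calling a model.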

If a developer wants to use simulated constrained generation explicitly they can specify the middleware on the generate call:

await generate(registry, {
  prompt: 'generate json',
  use: [
    simulateConstrainedGeneration(),
  ],
  output: {
    schema,
  },
});

The instructions that the middleware injects can be overridden:

await generate(registry, {
  prompt: 'generate json',
  use: [
    simulateConstrainedGeneration({
      instructionsRenderer: (schema) =>
        `must be json: ${schemaToTSInterface(schema)}`,
    }),
  ],
  output: {
    schema,
  },
});
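
As a rough illustration of what an overridable renderer implies, the sketch below falls back to a default instruction string unless the caller supplies its own. The names `SimulateOptions`, `defaultRenderer`, and `renderInstructions`, as well as the default wording, are assumptions made for this example and are not taken from the Genkit source.

```typescript
type JsonSchema = Record<string, unknown>;

// Hypothetical options shape mirroring the `instructionsRenderer` option above.
interface SimulateOptions {
  instructionsRenderer?: (schema: JsonSchema) => string;
}

// Assumed default: embed the raw JSON schema in a plain-language instruction.
const defaultRenderer = (schema: JsonSchema): string =>
  `Output should be in JSON format and conform to the following schema:\n\n${JSON.stringify(schema)}`;

// Use the caller's renderer when provided, otherwise the default.
function renderInstructions(schema: JsonSchema, options: SimulateOptions = {}): string {
  const render = options.instructionsRenderer ?? defaultRenderer;
  return render(schema);
}

// Example: a custom renderer that lists only the top-level schema keys.
const customInstructions = renderInstructions(
  { type: 'object', properties: {} },
  { instructionsRenderer: (s) => `must be json with keys: ${Object.keys(s).join(', ')}` },
);
```

Keeping the renderer a plain function of the schema means the instruction text can be swapped without touching the rest of the middleware.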

@pavelgj pavelgj requested a review from mbleigh January 15, 2025 20:02
@mbleigh mbleigh left a comment

I think I'm generally good with this, and I'm wondering if I made a mistake in how I was thinking about formats around streaming. I'm wondering if instead of format: 'array' it should have been {format: 'json', stream: 'incremental' | 'array'}.

I'm also wondering if format should accept mime types (e.g. application/json), with the existing json and text being just shorthands that get translated into application/json and text/plain.

It would be a breaking change, so if you think that sounds better we should try to work it in before 1.0.
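
For illustration only, the shorthand idea could look like the sketch below: known short names are translated to mime types, and anything else is passed through unchanged. `FORMAT_SHORTHANDS` and `toContentType` are hypothetical names, and this describes the proposal in this comment rather than existing Genkit behavior.

```typescript
// Hypothetical shorthand table: short format names mapped to mime types.
const FORMAT_SHORTHANDS = new Map<string, string>([
  ['json', 'application/json'],
  ['text', 'text/plain'],
]);

// Translate a shorthand to its mime type; values that are not shorthands
// (for example an explicit mime type) are returned as-is.
function toContentType(format: string): string {
  return FORMAT_SHORTHANDS.get(format) ?? format;
}
```

With this shape, existing callers passing `json` or `text` would keep working, while new callers could pass a mime type directly.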

pavelgj commented Jan 26, 2025

> I think I'm generally good with this, and I'm wondering if I made a mistake in how I was thinking about formats around streaming. I'm wondering if instead of format: 'array' it should have been {format: 'json', stream: 'incremental' | 'array'}.
>
> I'm also wondering if format should accept mime types (e.g. application/json), with the existing json and text being just shorthands that get translated into application/json and text/plain.
>
> It would be a breaking change, so if you think that sounds better we should try to work it in before 1.0.

Honestly, I like the simplicity of the current formats. They just make sense and are simple to use. It's not obvious what benefit splitting out the stream config would offer.

@pavelgj pavelgj merged commit a3c6c28 into main Jan 27, 2025
4 checks passed
@pavelgj pavelgj deleted the pj/constrainedRefactor branch January 27, 2025 13:46
@dongyangli1226
@pavelgj @mbleigh
Hi guys, thanks for this. We encountered issues similar to those mentioned in #1360 when upgrading to 1.0.5: Gemini models hallucinate significantly more than they did in 0.9.1. The fix you provided here does work. Could you help clarify what is going on and when we should use it? As far as we can tell, we are following the standard way Genkit and Dotprompt work according to the documentation. I would expect that declaring a JSON schema in Dotprompt instructs the model properly. When the model hallucinates, it still generates correctly formatted JSON, but the generated results are very poor.

Really appreciate your help. I think this might also affect others using dotprompt as it might seem like Gemini just doesn't follow instructions very well.

dongyangli1226 commented Feb 24, 2025

The requests with and without the middleware look exactly the same. However, without the middleware, the model doesn't consistently follow the prompt instructions.

request {
  "messages": [
    {
      "role": "user",
      "content": [
        { "text": "" }
      ]
    }
  ],
  "config": {
    "version": "gemini-2.0-pro-exp-02-05",
    "temperature": 0.2
  },
  "tools": [],
  "output": {
    "constrained": true,
    "contentType": "application/json",
    "format": "json",
    "schema": {
      "type": "object",
      "properties": {
        "insights": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "title": { "type": "string" },
              "description": { "type": "string" },
              "type": {
                "type": "string",
                "enum": [
                  "POTENTIAL_INCONSISTENCY",
                  "INCOMPLETE_ANSWER",
                  "ADMISSION"
                ]
              },
              "highlights": {
                "type": "array",
                "items": { "type": "string" }
              },
              "documentSnippets": {
                "type": "array",
                "items": {
                  "type": "object",
                  "properties": {
                    "documentId": { "type": "string" },
                    "text": { "type": "string" }
                  },
                  "required": ["documentId", "text"],
                  "additionalProperties": true
                }
              },
              "factIds": {
                "type": "array",
                "items": { "type": "string" }
              },
              "transcriptEntryIds": {
                "type": "array",
                "items": { "type": "string" }
              },
              "exampleQuestions": {
                "type": "array",
                "items": { "type": "string" }
              }
            },
            "required": ["title", "description", "type"],
            "additionalProperties": true
          }
        }
      },
      "required": ["insights"],
      "additionalProperties": true,
      "$schema": "http://json-schema.org/draft-07/schema#"
    }
  }
}


3 participants