BUG FIX: Pydantic conversion logic for structured outputs is broken for models containing dictionaries #2003
base: main
Conversation
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Thanks for the PR @dbczumar but as far as I'm aware |
Thanks @RobertCraigie! It appears that Interestingly, I did find that passing a pydantic model with a dictionary throws an error for an entirely different reason, which seems to be a separate OpenAI bug:
=> At a glance, it appears that the OpenAI backend may be handling The current PR doesn't help with this second issue:
However, modifying the PR locally to remove
@RobertCraigie In summary, I think there are two issues here:
Happy to file a separate issue for (2). Let me know what your thoughts are here. |
@RobertCraigie Any updates / additional thoughts here? |
This part isn't a bug, fwiw: the API only supports a specific subset of JSON Schema, so it is expected that some valid JSON schemas cannot be used directly with strict structured outputs.
I was not aware that could work, let me get back to you. |
@RobertCraigie The backend behavior is a bug because the error message says the schema is missing a property in “required”, yet the property it cites as missing is clearly present in the payload’s “required” field. At minimum, the error message is incorrect or misleading; I suspect there shouldn’t be an error at all. Thanks for looking into this! :) |
Changes being requested
There's a bug in the OpenAI Python client's logic for translating Pydantic models that contain dictionaries into structured outputs JSON schema definitions: the resulting schema always requires dictionaries to be empty. This makes dictionary outputs significantly less useful, since the LLM is never allowed to populate them.
This PR fixes the issue and introduces test coverage.
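To make the description above concrete, here is a minimal, stdlib-only sketch of the failure mode. It is not the client's actual conversion code. The schema is hand-written to resemble what Pydantic might emit for a model with a `Dict[str, int]` field, and the helper names (`naive_strictify`, `preserving_strictify`, `_walk`) are hypothetical. Whether the API accepts the second form under strict mode is a separate question discussed in the conversation.

```python
import copy

# Hand-written schema (assumption): roughly the shape produced for a model
# with `name: str` and `scores: Dict[str, int]`.
MODEL_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "scores": {
            "type": "object",
            "additionalProperties": {"type": "integer"},
        },
    },
    "required": ["name", "scores"],
}


def _walk(node, overwrite):
    """Recursively set additionalProperties=False on object nodes, in place."""
    if not isinstance(node, dict):
        return
    if node.get("type") == "object":
        # overwrite=True clobbers an existing value-type schema;
        # overwrite=False only fills in the key when it is absent.
        if overwrite or "additionalProperties" not in node:
            node["additionalProperties"] = False
    for child in node.get("properties", {}).values():
        _walk(child, overwrite)
    extra = node.get("additionalProperties")
    if isinstance(extra, dict):
        _walk(extra, overwrite)


def naive_strictify(schema):
    """Hypothetical buggy conversion: every object forbids extra keys."""
    out = copy.deepcopy(schema)
    _walk(out, overwrite=True)
    return out


def preserving_strictify(schema):
    """Hypothetical fix: keep a dictionary's declared value schema."""
    out = copy.deepcopy(schema)
    _walk(out, overwrite=False)
    return out


naive = naive_strictify(MODEL_SCHEMA)
fixed = preserving_strictify(MODEL_SCHEMA)
print(naive["properties"]["scores"])
print(fixed["properties"]["scores"])
```

In the naive result, `scores` has no declared properties and forbids additional ones, so the only valid value is an empty object. The preserving variant keeps the integer value schema for `scores` while still closing the top-level model.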
Additional context & links
Fixes #2004