Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

llama: preserve field order in user-defined JSON schemas #8002

Merged
merged 1 commit into from
Dec 11, 2024

Conversation

bmizerany
Copy link
Contributor

@bmizerany bmizerany commented Dec 9, 2024

llama: preserve field order in user-defined JSON schemas

Previously we decoded and re-encoded JSON schemas during validation,
which served no purpose since json.RawMessage already validates JSON
syntax. Worse, the re-encoding lost field ordering from the original
schema, which affects inference quality during step-by-step reasoning.

While fixing this ordering issue by using json.RawMessage directly,
testing revealed that schema_to_grammar (from llama.cpp) also fails to
preserve field order during grammar generation. This appears to be the
root cause of inference degradation.

This change prevents us from mangling the user's original schema order,
but we still need to address the ordering issue in schema_to_grammar.
That will be a separate change.

Updates #7978

@bmizerany bmizerany force-pushed the bmizerany/issue7978 branch 3 times, most recently from 0d216f7 to 4b4fe47 Compare December 9, 2024 01:25
@iscy
Copy link
Contributor

iscy commented Dec 9, 2024

@bmizerany, llama.cpp's json-schema-to-grammar already ensures that the order is being preserved by using nlohmann::ordered_json. These PRs were opened in the past and kept the original key ordering while still relying on that facility:
#7588
#6658

@bmizerany bmizerany changed the title DO NOT MERGE: llama: preserve field order in user-defined JSON schemas llama: preserve field order in user-defined JSON schemas Dec 11, 2024
@bmizerany
Copy link
Contributor Author

bmizerany commented Dec 11, 2024

Skipping schema_to_grammer tests in prep for @ParthSareen's work on patching. Ready to merge.

@bmizerany
Copy link
Contributor Author

@iscy Thank you for pointing that out. This gets us closer. More updates incoming for a complete fix.

Previously we decoded and re-encoded JSON schemas during validation,
which served no purpose since json.RawMessage already validates JSON
syntax. Worse, the re-encoding lost field ordering from the original
schema, which affects inference quality during step-by-step reasoning.

While fixing this ordering issue by using json.RawMessage directly,
testing revealed that schema_to_grammar (from llama.cpp) also fails to
preserve field order during grammar generation. This appears to be the
root cause of inference degradation.

This change prevents us from mangling the user's original schema order,
but we still need to address the ordering issue in schema_to_grammar.
That will be a separate change.

Updates #7978
@bmizerany bmizerany merged commit 9039c82 into main Dec 11, 2024
16 checks passed
@bmizerany bmizerany deleted the bmizerany/issue7978 branch December 11, 2024 22:07
bmizerany added a commit that referenced this pull request Dec 17, 2024
The change in #8002 introduced a regression where the server rejected
request with the format was set and empty. This was a regression from
the previous version of the code it was solving other problems for.

Then, in #8127, the server was updated to allow the empty format, but
also reintroduced the regression where the server would silently fail
when the format was set, but invalid.

This commit fixes both regressions. The server does not reject the empty
format, but it does reject invalid formats. It also adds tests to help
us catch regressions in the future.

Also, the updated code provides a more detailed error message when a
client sends a non-empty, but invalid format, echoing the invalid format
in the response.
bmizerany added a commit that referenced this pull request Dec 17, 2024
Changes in #8002 introduced fixes for bugs with mangling JSON Schemas.
It also fixed a bug where the server would silently fail when clients
requested invalid formats. It also, unfortunately, introduced a bug
where the server would reject requests with an empty format, which
should be allowed.

The change in #8127 updated the code to allow the empty format, but also
reintroduced the regression where the server would silently fail when
the format was set, but invalid.

This commit fixes both regressions. The server does not reject the empty
format, but it does reject invalid formats. It also adds tests to help
us catch regressions in the future.

Also, the updated code provides a more detailed error message when a
client sends a non-empty, but invalid format, echoing the invalid format
in the response.
bmizerany added a commit that referenced this pull request Dec 17, 2024
Changes in #8002 introduced fixes for bugs with mangling JSON Schemas.
It also fixed a bug where the server would silently fail when clients
requested invalid formats. It also, unfortunately, introduced a bug
where the server would reject requests with an empty format, which
should be allowed.

The change in #8127 updated the code to allow the empty format, but also
reintroduced the regression where the server would silently fail when
the format was set, but invalid.

This commit fixes both regressions. The server does not reject the empty
format, but it does reject invalid formats. It also adds tests to help
us catch regressions in the future.

Also, the updated code provides a more detailed error message when a
client sends a non-empty, but invalid format, echoing the invalid format
in the response.

This commits also takes the opportunity to remove superfluous linter
checks.
bmizerany added a commit that referenced this pull request Dec 17, 2024
Changes in #8002 introduced fixes for bugs with mangling JSON Schemas.
It also fixed a bug where the server would silently fail when clients
requested invalid formats. It also, unfortunately, introduced a bug
where the server would reject requests with an empty format, which
should be allowed.

The change in #8127 updated the code to allow the empty format, but also
reintroduced the regression where the server would silently fail when
the format was set, but invalid.

This commit fixes both regressions. The server does not reject the empty
format, but it does reject invalid formats. It also adds tests to help
us catch regressions in the future.

Also, the updated code provides a more detailed error message when a
client sends a non-empty, but invalid format, echoing the invalid format
in the response.

This commits also takes the opportunity to remove superfluous linter
checks.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants