Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Misc] Validate grammar and fail early #11119

Merged
merged 7 commits into from
Dec 12, 2024

Conversation

comaniac
Copy link
Collaborator

@comaniac comaniac commented Dec 11, 2024

If the grammar is invalid, we now let xGrammar throw RuntimeError when compiling it. However, this happens in logits processor, so the exception is raised from model executor. Since we don't expect model executor to throw any exception now, the exception will crash the engine and kill the worker process.

This PR adds a validation to make sure the grammar is valid when constructing the GrammarConfig to solve this issue.

Note that there is another issue with the xgrammar backend that isn't addressed by this PR #11118

cc @mgoin

Copy link

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

  • Add ready label to the PR
  • Enable auto-merge.

🚀

Signed-off-by: Cody Yu <hao.yu.cody@gmail.com>
@kouroshHakha
Copy link

Hey @comaniac,

1/ Let's add unittests for this. Let's make sure it doesn't diverge from this behavior later on.
2/ We should do a similar thing for json schema as well. Use the following extra_body and it will still kill the engine:
extra_body={"guided_json": {"type": "str"}}

@mgoin mgoin self-requested a review December 12, 2024 00:19
@comaniac
Copy link
Collaborator Author

I don't know where to add unit tests for xgrammar backend. It seems not much unit tests have been added for this component. @mgoin do you have any pointer or should we merge this PR first and make up unit tests later?

Per offline discussion with @mgoin, this PR also fixes the Lark-like issue. I also found another issue that crashes the engine with my fix:

  1. Send an invalid grammar. It failed with bad request. However, at this moment the tokenizer data is cached.
  2. Send a valid grammar. Since the tokenizer data is cached, we have encoded_vocab = None, which results in crash in get_compiler. This is because here is a hidden assumption that when encoded_vocab = None, the compiler must be initialized already, but this is no longer guaranteed.

The fix in this PR is to make sure encoded_vocab would never be None. This shouldn't hurt performance because the tokenizer data is cached anyways.

@comaniac comaniac linked an issue Dec 12, 2024 that may be closed by this pull request
1 task
Comment on lines -135 to -143
if tokenizer_hash in TokenizerDataCache._cache:
encoded_vocab = None
stop_token_ids = None
backend_str = None
else:
tokenizer_data = TokenizerDataCache.get_tokenizer_data(tokenizer)
encoded_vocab = tokenizer_data.encoded_vocab
stop_token_ids = tokenizer_data.stop_token_ids
backend_str = tokenizer_data.backend_str
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

encoded_vocab cannot be None anymore because the compiler may not be initialized even the tokenizer data is cached if the grammar is invalid. This change shouldn't hurt performance because the tokenizer data is cached anyways.

Signed-off-by: Cody Yu <hao.yu.cody@gmail.com>
Signed-off-by: Cody Yu <hao.yu.cody@gmail.com>
Signed-off-by: Cody Yu <hao.yu.cody@gmail.com>
Signed-off-by: Cody Yu <hao.yu.cody@gmail.com>
Signed-off-by: Cody Yu <hao.yu.cody@gmail.com>
Signed-off-by: Cody Yu <hao.yu.cody@gmail.com>
Comment on lines +29 to +33
# Look for GBNF rule definition
if '::=' in line:
return False

# Look for Lark-specific features
if any(pattern in line for pattern in ['?start:', '|', '~']):
return True

return False
return True

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would fix my issue!

Copy link
Member

@mgoin mgoin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@mgoin mgoin added the ready ONLY add when PR is ready to merge/full CI is needed label Dec 12, 2024
@mgoin mgoin enabled auto-merge (squash) December 12, 2024 17:05
@mgoin mgoin merged commit 2c97eca into vllm-project:main Dec 12, 2024
63 checks passed
sleepwalker2017 pushed a commit to sleepwalker2017/vllm that referenced this pull request Dec 13, 2024
@comaniac comaniac deleted the validate_grammar branch December 20, 2024 22:02
ZenPuzzle pushed a commit to ZenPuzzle/vllm that referenced this pull request Dec 24, 2024
BKitor pushed a commit to BKitor/vllm that referenced this pull request Dec 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready ONLY add when PR is ready to merge/full CI is needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: grammar_is_likely_lark doesn't work correctly
4 participants