
[Feature]: Only apply Guided/Structured grammar after reasoning steps in Reasoning models #12619

Open
1 task done
cksac opened this issue Jan 31, 2025 · 3 comments
@cksac

cksac commented Jan 31, 2025

🚀 The feature, motivation and pitch

Apply Guided/Structured grammar only to the answer portion of a reasoning model's output, i.e. for DeepSeek R1, enforce the grammar only inside <answer></answer> or after </think>.
This would make reasoning models more useful in agent workflows that expect structured output.

Alternatives

No response

Additional context

No response

Before submitting a new issue...

  • Make sure you have already searched for relevant issues and asked the chatbot at the bottom-right corner of the documentation page, which can answer many frequently asked questions.
@gaocegege
Contributor

Related to #11908

@gaocegege
Contributor

gaocegege commented Feb 8, 2025

I have a PoC tested with deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B, which introduces a check in the logits processor to skip the xgrammar schema enforcement until the end-of-reasoning token has been generated:

@dataclass
class XGrammarLogitsProcessor:
    """Wrapper class to support pickle protocol"""
    config: GrammarConfig

    ctx: xgr.CompiledGrammar | None = None
    token_bitmask: torch.Tensor = None  # type: ignore[assignment]
    matchers: list[xgr.GrammarMatcher] = field(default_factory=list)
    batch_size: int = field(default=1)
    prefilled: bool = field(default=False)

    def __call__(self, input_ids: list[int],
                 scores: torch.Tensor) -> torch.Tensor:
        # New: skip grammar enforcement while the model is still reasoning.
        if not reasoning_end(input_ids):
            return scores
        if self.ctx is None:
            self._ensure_ctx()
        ...


def reasoning_end(input_ids: list[int]) -> bool:
    """Check if input_ids contain the end-of-reasoning token."""
    # Hard-coded </think> token id for DeepSeek-R1-Distill-Qwen-1.5B
    endthink_token_id = 151649
    return endthink_token_id in input_ids
I can generalize it to support all R1 models and structured engines if you think this approach is effective.
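To illustrate what that generalization might look like: a minimal sketch, assuming the end-of-reasoning token can be resolved from the model's tokenizer at construction time rather than hard-coded. `ReasoningEndChecker` and `from_token` are hypothetical names, not part of vLLM's API; `tokenizer` is assumed to be a HuggingFace-style tokenizer exposing `encode()`.

```python
from dataclasses import dataclass


@dataclass
class ReasoningEndChecker:
    """Tracks whether the end-of-reasoning token has been generated.

    Hypothetical sketch: resolve the end token per model instead of
    hard-coding a single token id, so any R1-style reasoning model
    (or a different structured-output engine) can reuse the check.
    """
    end_token_id: int
    _seen: bool = False

    @classmethod
    def from_token(cls, tokenizer, end_token: str = "</think>"):
        # Resolve the special token to its id once, at construction time.
        ids = tokenizer.encode(end_token, add_special_tokens=False)
        assert len(ids) == 1, f"{end_token!r} must encode to a single token"
        return cls(end_token_id=ids[0])

    def __call__(self, input_ids: list[int]) -> bool:
        # Assumes the checker is called once per generated token, so it
        # only needs to inspect the newest token; the flag is cached to
        # avoid rescanning the whole sequence on every decoding step.
        if not self._seen and input_ids and input_ids[-1] == self.end_token_id:
            self._seen = True
        return self._seen
```

The caching also gives the desired one-way behavior: once `</think>` has been seen, grammar enforcement stays on for the rest of the generation.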

full code:

gaocegege@9134347

@gaocegege
Contributor

I have a draft PR: #12955

Please let me know if the approach works for you.
