
[Bugfix] Fix beam search logits processor #3454

Conversation

maximzubkov (Contributor):

This PR addresses the bug described in the following issue. Please refer to the issue for more details on the bug.

{
    id(lp): lp
    for lp in self.logits_processors
}
for lp in self.logits_processors if not hasattr(lp[0], "fsm")
maximzubkov (Contributor Author):

This affects the following line, as described in the following issue and partially addressed by the following PR. Unfortunately, in this case, the copy of logits_processors is inevitable, and it does indeed seem to slow down inference. However, I'm using relatively old GPUs (4x NVIDIA RTX A4000, 16 GB), so I would need someone with better hardware to benchmark the speed. Refer to the issue for code snippets to reproduce the tests.
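To illustrate why the per-sequence copy is hard to avoid, here is a minimal hypothetical sketch (StatefulProcessor is not a vLLM class, just a stand-in for an FSM-backed logits processor): when two beams share one stateful processor, its internal state advances once per beam per step and neither beam sees a consistent state, while a deep copy per sequence keeps each beam independent.

```python
import copy

# Hypothetical minimal stand-in for an FSM-backed logits processor:
# it mutates internal state on every call, like guided-decoding guides do.
class StatefulProcessor:
    def __init__(self):
        self.state = 0

    def __call__(self, input_ids, scores):
        self.state += 1  # advance per-sequence state by one token
        return scores

# Two beams sharing ONE processor: after 3 steps each, the shared state
# has advanced 6 times, so neither beam's state matches its own length.
shared = StatefulProcessor()
for _ in range(3):
    shared([], None)  # beam 1
    shared([], None)  # beam 2
print(shared.state)  # 6

# Deep-copying per sequence (what this PR enables) keeps each beam's
# state independent and consistent with its own token count.
a = StatefulProcessor()
b = copy.deepcopy(a)
for _ in range(3):
    a([], None)
    b([], None)
print(a.state, b.state)  # 3 3
```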

@@ -99,8 +120,16 @@ def __call__(self, input_ids: List[int],

return scores

def __deepcopy__(self, memo):
maximzubkov (Contributor Author):

Note that by implementing the __deepcopy__ method here, it is now possible to copy.deepcopy an entire BaseGuidedLogitsProcessor object. These lines are needed since a naive copy of outlines.fsm.guide.CFGGuide fails:

Code to reproduce:

import copy
from outlines.fsm.guide import CFGGuide
from transformers import AutoTokenizer

model = "microsoft/phi-1"

tokenizer = AutoTokenizer.from_pretrained(model)

simple_sql_grammar = """
start: select_statement

select_statement: "SELECT" column "from" table "where" condition

column: "col_1" | "col_2"
table: "table_1" | "table_2"
condition: column "=" number

number: "1" | "2"
"""

fsm = CFGGuide(simple_sql_grammar, tokenizer)
copy.deepcopy(fsm)

Error message:

Traceback (most recent call last):
  File "/Users/maximzubkov/tmp.py", line 22, in <module>
    copy.deepcopy(fsm)
  File "/Users/maximzubkov/.pyenv/versions/3.10.2/lib/python3.10/copy.py", line 172, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/Users/maximzubkov/.pyenv/versions/3.10.2/lib/python3.10/copy.py", line 271, in _reconstruct
    state = deepcopy(state, memo)
  File "/Users/maximzubkov/.pyenv/versions/3.10.2/lib/python3.10/copy.py", line 146, in deepcopy
    y = copier(x, memo)
  File "/Users/maximzubkov/.pyenv/versions/3.10.2/lib/python3.10/copy.py", line 231, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  [the deepcopy/_reconstruct/_deepcopy_dict frames above repeat several times while recursing into nested state]
  File "/Users/maximzubkov/.pyenv/versions/3.10.2/lib/python3.10/copy.py", line 161, in deepcopy
    rv = reductor(4)
TypeError: cannot pickle 'module' object
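The workaround can be sketched as follows. GuideLike below is a hypothetical stand-in for CFGGuide, assuming (as the TypeError suggests) that the unpicklable attribute is module-typed: a custom __deepcopy__ shares module attributes and deep-copies everything else.

```python
import copy
import types

# Hypothetical sketch: GuideLike stands in for CFGGuide. Modules cannot be
# pickled or deep-copied, so __deepcopy__ shares them and copies the rest.
class GuideLike:
    def __init__(self):
        self.parser_module = types  # stand-in for a module held inside CFGGuide
        self.state = {"pos": 0}

    def __deepcopy__(self, memo):
        cls = self.__class__
        new = cls.__new__(cls)
        memo[id(self)] = new  # register early to handle reference cycles
        for key, value in self.__dict__.items():
            if isinstance(value, types.ModuleType):
                new.__dict__[key] = value  # share the module, don't copy it
            else:
                new.__dict__[key] = copy.deepcopy(value, memo)
        return new

g = GuideLike()
g2 = copy.deepcopy(g)
g2.state["pos"] = 5
print(g.state["pos"], g2.state["pos"])  # 0 5 -- mutable state is independent
```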

@maximzubkov (Contributor Author) left a comment:

I also left some comments and suggestions.

@maximzubkov
Copy link
Contributor Author

I was quite surprised to see that so many CI tests failed, so I tried to reproduce them locally on my machine, starting with the error from tests/entrypoints/test_openai_server.py::test_guided_json_completion:

E           jsonschema.exceptions.ValidationError: 'work history' is a required property
E           
E           Failed validating 'required' in schema:
E               {'properties': {'age': {'type': 'integer'},
E                               'name': {'type': 'string'},
E                               'skills': {'items': {'maxLength': 10, 'type': 'string'},
E                                          'minItems': 3,
E                                          'type': 'array'},
E                               'work history': {'items': {'properties': {'company': {'type': 'string'},
E                                                                         'duration': {'type': 'string'},
E                                                                         'position': {'type': 'string'}},
E                                                          'required': ['company',
E                                                                       'position'],
E                                                          'type': 'object'},
E                                                'type': 'array'}},
E                'required': ['name', 'age', 'skills', 'work history'],
E                'type': 'object'}
E           
E           On instance:
E               {'age': 25,
E                'name': 'John Doe',
E                'skills': ['JavaScript', 'React', 'CSS'],
E                'workHistory': [{'company': 'Company A',
E                                 'duration': '2 years',
E                                 'position': 'Developer'},
E                                {'company': 'Company B',
E                                 'duration': '3 years',
E                                 'position': 'Team Lead'}]}

The bug here is that work history was for some reason renamed to the camel-case workHistory. Unfortunately, I only have access to an NVIDIA RTX A4000 with 16 GB, so I was not able to run the tests with HuggingFaceH4/zephyr-7b-beta. Instead, I used microsoft/phi-1 (though it might actually be a more suitable model for JSON-formatted outputs). Here is a tiny code snippet based on the above test:

from transformers import AutoTokenizer
from vllm import LLM, SamplingParams
from vllm.model_executor.guided_logits_processors import JSONLogitsProcessor


model = "microsoft/phi-1"
tokenizer = AutoTokenizer.from_pretrained(model)

schema = {
    "type": "object",
    "properties": {
        "name": {
            "type": "string"
        },
        "age": {
            "type": "integer"
        },
        "skills": {
            "type": "array",
            "items": {
                "type": "string",
                "maxLength": 10
            },
            "minItems": 3
        },
        "work history": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "company": {
                        "type": "string"
                    },
                    "duration": {
                        "type": "string"
                    },
                    "position": {
                        "type": "string"
                    }
                },
                "required": ["company", "position"]
            }
        }
    },
    "required": ["name", "age", "skills", "work history"]
}

prompt = f"Give an example JSON for an employee profile that fits this schema: {schema}"

sampling_params = SamplingParams(
    temperature=1.0,
    n=3,
    max_tokens=500,
    logits_processors=[JSONLogitsProcessor(schema, tokenizer)],
)
llm = LLM(model=model, dtype="auto")
outputs = llm.generate([prompt], sampling_params)
print([output_.text for output_ in outputs[0].outputs])

And my local outputs were:

[
     '{"name": "John Doe", "age": 30, "skills": ["Python", "Java", "C++"], "work history": [{"company": "Apple", "duration": "2h", "position": "Developer"}, {"company": "Microsoft", "duration": "4h", "position": "Manager"}]}', 
     '{"name": "Sarah", "age": 35, "skills": ["Python", "Data Lab", "Machine Al"],"work history": [{"company": "Apple", "duration": "7 days", "position": "Actor"}]}', 
     '{"name": "John Doe", "age": 30, "skills": ["Python", "Java", "JavaScript"], "work history": [{"company": "Acme", "duration": "6 months", "position": "Manager"}, {"company": "ACM", "duration": "7 days", "position": "Associate"}, {"company": "LLC", "duration": "1 month", "position": "AM Analyst"}]}'
]

Here, for some reason, everything works fine, and work history is formatted as expected. At this point I definitely need a review; calling for help @simon-mo
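The key-name check above can be mirrored with a small stdlib-only helper (REQUIRED and check are illustrative names, not part of the test suite): every generated output must contain the exact required keys, including "work history" with a space rather than the camel-cased "workHistory".

```python
import json

# Stdlib-only check mirroring the failing CI assertion: the output JSON
# must contain every required key verbatim, including "work history".
REQUIRED = ["name", "age", "skills", "work history"]

def check(output: str) -> bool:
    data = json.loads(output)
    return all(key in data for key in REQUIRED)

good = '{"name": "J", "age": 1, "skills": [], "work history": []}'
bad = '{"name": "J", "age": 1, "skills": [], "workHistory": []}'
print(check(good), check(bad))  # True False
```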

@simon-mo simon-mo self-requested a review March 17, 2024 22:32
@simon-mo simon-mo self-assigned this Mar 17, 2024
@simon-mo (Collaborator) left a comment:

Sorry about the delay. I think changing the model to phi is fine, but you will need to move the guided-decoding-related tests from test_openai_server.py to a separate file, because the existing tests rely on LoRA models on Zephyr.

@@ -694,6 +694,7 @@ async def test_guided_grammar(server, client: openai.AsyncOpenAI):
prompt=("Generate a sql state that select col_1 from "
"table_1 where it is equals to 1"),
temperature=1.0,
n=3,
simon-mo (Collaborator) commented on the diff:

Since n=3 here, let's also verify all the outputs. You might also need to add use_beam_search to extra_body.

github-actions bot commented:

This pull request has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this pull request should remain open. Thank you!

@github-actions github-actions bot added the stale label Oct 29, 2024

mergify bot commented Oct 29, 2024

This pull request has merge conflicts that must be resolved before it can be
merged. @maximzubkov please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork
