QWEN2 Model generate failed #10352

Closed
kunger97 opened this issue Mar 8, 2024 · 1 comment

kunger97 commented Mar 8, 2024
I'm using a model fine-tuned from Qwen2 (Qwen1.5). When I load the model with BigDL and call the generate method, Python raises the error below.
I'm running on an Intel(R) Data Center GPU Flex 170.
The model is loaded via:
AutoModelForCausalLM.from_pretrained(model_name_or_path, load_in_4bit=True, optimize_model=True, trust_remote_code=True, use_cache=True)
The full error message:

2024-03-07 23:35:19 s084-n001 root[2367379] INFO intel_extension_for_pytorch auto imported
Loading checkpoint shards: 100%|██████████| 3/3 [00:00<00:00,  6.18it/s]
2024-03-07 23:35:20 s084-n001 bigdl.llm.transformers.utils[2367379] INFO Converting the current model to sym_int4 format......
Traceback (most recent call last):
  File "/home/u1024045/Sakura-13B-Galgame/server.py", line 101, in <module>
    state.get_model().check_model_by_magic()
  File "/home/u1024045/Sakura-13B-Galgame/utils/model.py", line 261, in check_model_by_magic
    (prompt, ground_truth, output) = self.test_loaded()
  File "/home/u1024045/Sakura-13B-Galgame/utils/model.py", line 384, in test_loaded
    output = self.completion(prompt, generation_config, is_print_speed=False)
  File "/home/u1024045/Sakura-13B-Galgame/utils/model.py", line 351, in completion
    output = self.get_model_response(
  File "/home/u1024045/Sakura-13B-Galgame/utils/model.py", line 582, in get_model_response
    output, (input_tokens_len, new_tokens) = self.__general_model(model, tokenizer, prompt, model_version, generation_config)
  File "/home/u1024045/Sakura-13B-Galgame/utils/model.py", line 518, in __general_model
    generation = model.generate(**input_tokens.to(model.device), generation_config=generation_config)[0]
  File "/home/u1024045/intel/intelpython3/envs/SakuraLLM/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/u1024045/intel/intelpython3/envs/SakuraLLM/lib/python3.10/site-packages/transformers/generation/utils.py", line 1544, in generate
    return self.greedy_search(
  File "/home/u1024045/intel/intelpython3/envs/SakuraLLM/lib/python3.10/site-packages/transformers/generation/utils.py", line 2404, in greedy_search
    outputs = self(
  File "/home/u1024045/intel/intelpython3/envs/SakuraLLM/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/u1024045/intel/intelpython3/envs/SakuraLLM/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/u1024045/intel/intelpython3/envs/SakuraLLM/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 1173, in forward
    outputs = self.model(
  File "/home/u1024045/intel/intelpython3/envs/SakuraLLM/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/u1024045/intel/intelpython3/envs/SakuraLLM/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/u1024045/intel/intelpython3/envs/SakuraLLM/lib/python3.10/site-packages/bigdl/llm/transformers/models/qwen2.py", line 83, in qwen2_model_forward
    return Qwen2Model.forward(
  File "/home/u1024045/intel/intelpython3/envs/SakuraLLM/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 1058, in forward
    layer_outputs = decoder_layer(
  File "/home/u1024045/intel/intelpython3/envs/SakuraLLM/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/u1024045/intel/intelpython3/envs/SakuraLLM/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/u1024045/intel/intelpython3/envs/SakuraLLM/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 773, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/home/u1024045/intel/intelpython3/envs/SakuraLLM/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/u1024045/intel/intelpython3/envs/SakuraLLM/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/u1024045/intel/intelpython3/envs/SakuraLLM/lib/python3.10/site-packages/bigdl/llm/transformers/models/qwen2.py", line 111, in qwen2_attention_forward
    return forward_function(
  File "/home/u1024045/intel/intelpython3/envs/SakuraLLM/lib/python3.10/site-packages/bigdl/llm/transformers/models/qwen2.py", line 266, in qwen2_attention_forward_origin
    query_states, key_states, value_states = linear_q4_0.forward_qkv_bias(*args)
TypeError: forward_qkv_bias(): incompatible function arguments. The following argument types are supported:
    1. (arg0: torch.Tensor, arg1: torch.Tensor, arg2: torch.Tensor, arg3: torch.Tensor, arg4: torch.Tensor, arg5: torch.Tensor, arg6: torch.Tensor, arg7: torch.Tensor, arg8: torch.Tensor, arg9: torch.Tensor, arg10: int, arg11: int, arg12: int, arg13: int, arg14: float) -> List[torch.Tensor]

Invoked with: tensor([[-0.0327,  0.0158, -0.1568,  ...,  0.0144, -0.0628, -0.0083]], device='xpu:0');
    three quantized weight Parameters (FP4Params, dtype=torch.uint8, device='xpu:0');
    three float bias tensors on xpu:0;
    tensor([[77]], device='xpu:0');
    two cached key/value float tensors on xpu:0;
    and the scalars 2, 77, 128, 1000000.0.
That is 14 positional arguments in total, where the binding expects 15 (arg0..arg14).
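The failure mode is an arity mismatch at the compiled-extension boundary: per the error text, the `forward_qkv_bias` binding accepts exactly 15 positional arguments (arg0 through arg14), while the caller in bigdl's `qwen2_attention_forward_origin` passes 14. A minimal stdlib sketch of the same rejection (the stub and its parameter names are illustrative placeholders, not the real `linear_q4_0` binding):

```python
# Stand-in for the compiled linear_q4_0.forward_qkv_bias binding:
# the real pybind11 function requires exactly 15 positional arguments.
# All parameter names here are hypothetical.
def forward_qkv_bias_stub(q, k, v, w_q, w_k, w_v, b_q, b_k, b_v,
                          pos_ids, cache_k, cache_v, n_head, seq_len, rope_base):
    # The real binding returns List[torch.Tensor]; the stub just echoes inputs.
    return [q, k, v]

# Passing 14 arguments, as in the traceback, is rejected before the
# function body ever runs:
try:
    forward_qkv_bias_stub(*range(14))
except TypeError as e:
    print("rejected:", e)
```

A pybind11 binding reports this as "incompatible function arguments" rather than Python's usual "missing 1 required positional argument", but the root cause is the same: the Python-side caller and the compiled signature disagree on the argument list.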
rnwang04 mentioned this issue Mar 8, 2024
rnwang04 self-assigned this Mar 8, 2024

rnwang04 (Contributor) commented Mar 8, 2024

This will be fixed by #10356.
