Fix torch.fx symbolic tracing for LLama #30047

Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thanks for fixing!
Should work! Let's fix other models as well
@@ -987,7 +987,9 @@ def forward(
     if position_ids is None:
         position_ids = cache_position.unsqueeze(0)

-    causal_mask = self._update_causal_mask(attention_mask, inputs_embeds, cache_position)
+    causal_mask = self._update_causal_mask(
+        attention_mask, inputs_embeds, cache_position, past_seen_tokens + inputs_embeds.shape[1]
+    )
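For context on why an explicit length argument helps here: during `torch.fx` symbolic tracing, tensors are replaced by proxies that record operations instead of holding values, so any code path that needs a concrete Python integer out of a tensor fails at trace time. A minimal pure-Python sketch of that recording behavior (the `Graph` and `Proxy` classes below are simplified stand-ins for illustration, not the real `torch.fx` API):

```python
class Graph:
    """Records operations instead of executing them."""
    def __init__(self):
        self.nodes = []

    def record(self, op, args):
        self.nodes.append((op, args))
        return Proxy(self, f"node_{len(self.nodes) - 1}")

class Proxy:
    """Stand-in for a tensor during tracing: every op yields a new proxy."""
    def __init__(self, graph, name):
        self.graph = graph
        self.name = name

    def __getitem__(self, idx):
        # Indexing a proxy is fine: it just records a node in the graph.
        return self.graph.record("getitem", (self.name, idx))

    def __int__(self):
        # There is no runtime value while tracing, so conversion must fail.
        raise TypeError("cannot convert a traced proxy to a concrete int")

graph = Graph()
cache_position = Proxy(graph, "cache_position")

last = cache_position[-1]  # fine: recorded as a graph node
try:
    int(last)              # breaks tracing: no concrete value exists yet
except TypeError as err:
    print(err)
```

This mirrors why the patch threads `past_seen_tokens + inputs_embeds.shape[1]` through as an argument: the addition stays a traceable expression rather than forcing a concrete value out of a proxy mid-trace.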
I guess passing cache_position[-1] doesn't work? The only issue is that past_seen_tokens is gonna be deprecated in favor of cache positions, but that works since with static cache we use the max position embeddings.
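The point about the static cache can be sketched as follows: with a pre-allocated (static) cache the causal mask always spans the full cache size, so it no longer depends on `past_seen_tokens` at all; only the dynamic-cache path does. The helper name below is hypothetical, chosen for illustration, and is not the transformers API:

```python
def causal_mask_length(past_seen_tokens, new_tokens, static_cache_size=None):
    """Hypothetical helper: pick the key/value length the causal mask covers.

    With a static (pre-allocated) cache the mask spans the whole cache
    (e.g. max_position_embeddings), so deprecating `past_seen_tokens` is
    harmless there; with a dynamic cache the length grows with the tokens
    seen so far.
    """
    if static_cache_size is not None:
        return static_cache_size  # fixed size, independent of history
    return past_seen_tokens + new_tokens

# dynamic cache: mask length tracks tokens processed so far
assert causal_mask_length(10, 3) == 13
# static cache: mask length pinned to the pre-allocated size
assert causal_mask_length(10, 3, static_cache_size=4096) == 4096
```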
Last thing, can you run the slow tests with RUN_SLOW=1 pytest tests/models/llama on CUDA?
====================================== short test summary info =======================================
SKIPPED [1] ../miniconda3/envs/py39/lib/python3.9/unittest/case.py:117: TODO @gante fix this for Llama
SKIPPED [1] tests/test_pipeline_mixin.py:327: LlamaModelTest::test_pipeline_audio_classification is skipped: `audio-classification` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.
SKIPPED [1] tests/test_pipeline_mixin.py:331: LlamaModelTest::test_pipeline_automatic_speech_recognition is skipped: `automatic-speech-recognition` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.
SKIPPED [1] tests/test_pipeline_mixin.py:335: LlamaModelTest::test_pipeline_conversational is skipped: `conversational` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.
SKIPPED [1] tests/test_pipeline_mixin.py:339: LlamaModelTest::test_pipeline_depth_estimation is skipped: `depth-estimation` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.
SKIPPED [1] tests/test_pipeline_mixin.py:346: test requires PyTesseract
SKIPPED [1] tests/test_pipeline_mixin.py:357: LlamaModelTest::test_pipeline_fill_mask is skipped: `fill-mask` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.
SKIPPED [1] tests/test_pipeline_mixin.py:361: LlamaModelTest::test_pipeline_image_classification is skipped: `image-classification` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.
SKIPPED [1] tests/test_pipeline_mixin.py:379: LlamaModelTest::test_pipeline_image_feature_extraction is skipped: `image-feature-extraction` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.
SKIPPED [1] tests/test_pipeline_mixin.py:367: LlamaModelTest::test_pipeline_image_segmentation is skipped: `image-segmentation` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.
SKIPPED [1] tests/test_pipeline_mixin.py:374: LlamaModelTest::test_pipeline_image_to_text is skipped: `image-to-text` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.
SKIPPED [1] tests/test_pipeline_mixin.py:386: `run_pipeline_test` is currently not implemented.
SKIPPED [1] tests/test_pipeline_mixin.py:393: LlamaModelTest::test_pipeline_object_detection is skipped: `object-detection` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.
SKIPPED [1] tests/test_pipeline_mixin.py:404: LlamaModelTest::test_pipeline_summarization is skipped: `summarization` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.
SKIPPED [1] tests/test_pipeline_mixin.py:408: LlamaModelTest::test_pipeline_table_question_answering is skipped: `table-question-answering` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.
SKIPPED [1] tests/test_pipeline_mixin.py:412: LlamaModelTest::test_pipeline_text2text_generation is skipped: `text2text-generation` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.
SKIPPED [1] tests/test_pipeline_mixin.py:425: LlamaModelTest::test_pipeline_text_to_audio is skipped: `text-to-audio` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.
SKIPPED [1] tests/test_pipeline_mixin.py:430: LlamaModelTest::test_pipeline_token_classification is skipped: `token-classification` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.
SKIPPED [1] tests/test_pipeline_mixin.py:434: LlamaModelTest::test_pipeline_translation is skipped: `translation` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.
SKIPPED [1] tests/test_pipeline_mixin.py:438: LlamaModelTest::test_pipeline_video_classification is skipped: `video-classification` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.
SKIPPED [1] tests/test_pipeline_mixin.py:445: LlamaModelTest::test_pipeline_visual_question_answering is skipped: `visual-question-answering` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.
SKIPPED [1] tests/test_pipeline_mixin.py:455: LlamaModelTest::test_pipeline_zero_shot_audio_classification is skipped: `zero-shot-audio-classification` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.
SKIPPED [1] tests/test_pipeline_mixin.py:460: LlamaModelTest::test_pipeline_zero_shot_image_classification is skipped: `zero-shot-image-classification` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.
SKIPPED [1] tests/test_pipeline_mixin.py:465: LlamaModelTest::test_pipeline_zero_shot_object_detection is skipped: `zero-shot-object-detection` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.
SKIPPED [1] tests/models/llama/test_modeling_llama.py:371: Llama buffers include complex numbers, which breaks this test
SKIPPED [1] tests/models/llama/test_modeling_llama.py:657: Model is curently gated
SKIPPED [1] tests/models/llama/test_modeling_llama.py:615: Logits are not exactly the same, once we fix the instabalities somehow, will update!
SKIPPED [1] tests/models/llama/test_modeling_llama.py:628: Logits are not exactly the same, once we fix the instabalities somehow, will update!
SKIPPED [1] tests/models/llama/test_modeling_llama.py:641: Logits are not exactly the same, once we fix the instabalities somehow, will update! Also it is gonna be a `too_slow` test
SKIPPED [1] tests/models/llama/test_modeling_llama.py:602: Logits are not exactly the same, once we fix the instabalities somehow, will update!
================= 1 failed, 120 passed, 30 skipped, 23 warnings in 143.75s (0:02:23) =================
LGTM, thanks for taking the time to fix it!
What does this PR do?
As per title.