Add tests for qwen + allow uninitialized weights in Llama model #8552

Open · jackzhxng wants to merge 6 commits into jz/export_qwen from jz/export_qwen_tests
Conversation

jackzhxng (Contributor) commented on Feb 18, 2025

Summary

Adds a basic CI test for the Qwen model. Requires some changes to llama/model.py to allow uninitialized (random) weights.
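
For reference, a minimal sketch of what "uninitialized (random) weights" could mean here, assuming the meta-device construction that llama/model.py's comments describe (the module and init choices below are illustrative, not the PR's actual code):

```python
import torch
import torch.nn as nn

# Build the module on the "meta" device: parameters get shapes but no storage.
with torch.device("meta"):
    model = nn.Linear(16, 16)  # stand-in for the real Llama Transformer

# Materialize real (uninitialized) storage, then fill it with random values
# so the model can be exported without a compatible checkpoint.
model = model.to_empty(device="cpu")
for p in model.parameters():
    nn.init.normal_(p)
```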

Test plan

  • Tested locally and passed.
  • Check that CI passes for the new test.

pytorch-bot (bot) commented on Feb 18, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8552

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 9b5516b with merge base 1858086:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the CLA Signed label on Feb 18, 2025

This PR needs a release notes: label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

jackzhxng force-pushed the jz/export_qwen_tests branch from 5ef3fef to c58edc5 on February 18, 2025
jackzhxng changed the base branch from main to jz/export_qwen on February 18, 2025
Comment on lines +240 to +249
try:
# assign=True: load params/buffers by assignment instead of performing an in-place copy.
# Because we are using device="meta", tensors do not have memory associated with them
# and an in-place copy is a no-op. Use assign=True in load_state_dict for this scenario.
missing, unexpected = self.model_.load_state_dict(
checkpoint,
strict=False,
assign=True,
) # self.model_ = Transformer(gptconf)
except RuntimeError as e:
Contributor commented:
Why is this needed?

jackzhxng (Author) replied:
So it doesn't error out when loading examples/models/llama/params/demo_rand_params.pth or any other checkpoint that is incompatible with the model architecture. We also have no way to not specify a checkpoint; I looked into removing the default value for that arg, but it's going to take some work since it's relied on internally in a lot of places.
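
For context, a runnable sketch of the load pattern under discussion, assuming a meta-device model as in the snippet above (DemoModel and the toy checkpoint are hypothetical, for illustration only):

```python
import torch
import torch.nn as nn

class DemoModel(nn.Module):
    # Hypothetical stand-in for the Llama Transformer.
    def __init__(self) -> None:
        super().__init__()
        self.proj = nn.Linear(16, 16)

with torch.device("meta"):
    model = DemoModel()  # parameters have no real storage yet

checkpoint = {"proj.weight": torch.randn(16, 16)}  # deliberately missing proj.bias

try:
    # assign=True swaps the meta tensors for the checkpoint tensors outright;
    # an in-place copy would be a no-op since meta tensors own no memory.
    missing, unexpected = model.load_state_dict(
        checkpoint, strict=False, assign=True
    )
    print("missing:", missing)        # keys in the model but not the checkpoint
    print("unexpected:", unexpected)  # keys in the checkpoint but not the model
except RuntimeError as e:
    # strict=False tolerates missing/unexpected keys, but shape mismatches
    # still raise RuntimeError -- which is what an architecturally
    # incompatible checkpoint (e.g. demo_rand_params.pth against a Qwen
    # config) triggers, hence the try/except in the PR.
    print(f"incompatible checkpoint, keeping uninitialized weights: {e}")
```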

pytorch deleted a comment from larryliu0820 on Feb 19, 2025

EXECUTORCH_DEFINED_MODELS = [
"stories110m",
"llama2",
"llama3",
"llama3_1",
"llama3_2",
"static_llama",
"qwen2_5",
jackzhxng (Author) commented:
Sorry, I accidentally deleted the original comment about ordering, but I was going to say that I think it's clearer to list all the llama models first.

jackzhxng force-pushed the jz/export_qwen_tests branch from 47bed4c to 9b5516b on February 19, 2025
larryliu0820 (Contributor) left a comment:

I'm OK with the changes, but I'm really concerned about llama/model.py and I think we should clean it up. I'll create a separate issue.

Labels: CLA Signed, topic: not user facing
3 participants