
Improve transformers-cli env reporting #31003

Merged: 3 commits into huggingface:main from ji-huazhong:env, May 29, 2024
Conversation

@ji-huazhong (Contributor) commented on May 24, 2024

What does this PR do?

As we're getting more issues related to specific NPUs, like:

  1. Huawei NPU `device_map=auto` doesn't split the model evenly over all devices (accelerate#2368)
  2. Open NPU-related issues in hiyouga/LLaMA-Factory: https://github.com/hiyouga/LLaMA-Factory/issues?q=is%3Aissue+npu+is%3Aopen

This PR modifies `transformers-cli env` to report which NPU the user is running on.
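
For reference, here is a minimal sketch of how such reporting might be wired up, assuming the torch_npu plugin is installed (it patches torch with an npu module and a torch.version.cann attribute). This is an illustration of the idea, not the merged diff:

    # Illustrative sketch (not the merged diff): collect NPU details for the
    # `transformers-cli env` report when an Ascend NPU is present.
    import torch
    from transformers.utils import is_torch_npu_available

    def npu_env_info() -> dict:
        info = {}
        if is_torch_npu_available():
            # Both lines below assume the torch_npu plugin, which patches
            # torch with an `npu` module and a `torch.version.cann` attribute.
            info["NPU type"] = torch.npu.get_device_name()
            info["CANN version"] = torch.version.cann
        return info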

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

cc @muellerzr

@ji-huazhong (Contributor, Author) commented on May 24, 2024

With this patch, on a GPU machine:

(hf) lynn@LAPTOP:~/github/transformers$ transformers-cli env

Copy-and-paste the text below in your GitHub issue and FILL OUT the two last points.

- `transformers` version: 4.42.0.dev0
- Platform: Linux-5.10.16.3-microsoft-standard-WSL2-x86_64-with-glibc2.35
- Python version: 3.10.13
- Huggingface_hub version: 0.23.1
- Safetensors version: 0.4.2
- Accelerate version: 0.30.1
- Accelerate config:    not found
- PyTorch version (GPU?): 2.3.0+cu121 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>
- GPU type: NVIDIA GeForce RTX 4060 Laptop GPU

and on an NPU machine:

(lynn) [root@localhost transformers-env]# transformers-cli env

Copy-and-paste the text below in your GitHub issue and FILL OUT the two last points.

- `transformers` version: 4.42.0.dev0
- Platform: Linux-5.10.0-60.125.0.152.oe2203.aarch64-aarch64-with-glibc2.26
- Python version: 3.8.18
- Huggingface_hub version: 0.23.0
- Safetensors version: 0.4.2
- Accelerate version: 0.30.0
- Accelerate config:    not found
- PyTorch version (GPU?): 2.1.0 (False)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>
- NPU type: Ascend910B1
- CANN version: 8.0.RC1

@ji-huazhong (Contributor, Author) commented

cc @amyeroberts

@amyeroberts (Collaborator) left a comment

Thanks for adding this! Just a small request to add.

@ji-huazhong (Contributor, Author) commented

Hi @amyeroberts, it's ready for re-review. :)

@muellerzr (Contributor) left a comment

Thanks! LGTM as well

@HuggingFaceDocBuilderDev (bot) commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@amyeroberts merged commit c886137 into huggingface:main on May 29, 2024. 8 checks passed.
@Rocketknight1 (Member) commented

I think this PR is causing issues in the CI: `pt_cuda_available` defaults to `"NA"`, which is a truthy string, so the `if pt_cuda_available:` block runs even when torch is not installed, and that causes our TF and Flax tests to fail.
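
A minimal reproduction of that pattern (illustrative, not the actual env.py code):

    # The sentinel "NA" is a non-empty string, so it is truthy.
    pt_cuda_available = "NA"  # default used when torch is not installed

    if pt_cuda_available:
        # This branch runs even without torch installed.
        print("would query torch.cuda here")

    # One possible fix: require an explicit True (or default to False).
    if pt_cuda_available is True:
        print("runs only when CUDA was actually detected")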

@Rocketknight1 (Member) commented

Opened a fix at #31113!

@ji-huazhong deleted the env branch on June 17, 2024, 08:10.