
Remove check of device consistency for balanced_low_0. #2591

Closed · wants to merge 1 commit

Conversation

@xkszltl commented Mar 27, 2024

It seems `balanced_low_0` can leave GPU 0 empty, which breaks this check. According to the discussion, the check may be outdated.

Resolves #2429
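
For context, a minimal repro sketch of the failure mode (the checkpoint and the resulting device map are illustrative assumptions, not taken from this PR):

```python
# With device_map="balanced_low_0", Accelerate keeps GPU 0 nearly empty
# (reserving it for e.g. generation outputs), so on a multi-GPU machine a
# quantized model's weights can end up only on GPUs 1..N-1.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-1.3b",  # illustrative checkpoint
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="balanced_low_0",
)
# On a 4-GPU box this can print e.g. {1, 2, 3}: GPU 0 holds no weights, so a
# check asserting that the model's main device matches the training device fails.
print(set(model.hf_device_map.values()))
```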
@xkszltl (Author) commented Mar 27, 2024

CC @SunMarc @younesbelkada

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@SunMarc (Member) commented Apr 2, 2024

Hi @xkszltl, thanks for the PR. Could you run the bitsandbytes tests to check that this PR doesn't break anything? Otherwise, I'll do that after we merge a pretty important PR that fixes the quantization CI!

@xkszltl (Author) commented Apr 2, 2024

Feel free to sequence it after your work. I'm busy with something else as well, and my local resources are all occupied.

BTW, is there no CI coverage?

@xkszltl (Author) commented Apr 2, 2024

This won't break anything as long as it compiles, because it's just removing an assertion.

@younesbelkada (Contributor) left a comment

Hmmm, I'm not sure we should remove this check; it's important to keep it to educate users about how to train quantized models using multiple GPUs. Is there a way to make sure we are under the balanced_low_0 regime and relax the constraint only for that case?
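
For illustration, such a carve-out might look roughly like the sketch below. The helper names are hypothetical, and since Accelerate only sees the resolved `hf_device_map` rather than the preset that produced it, the detection would have to be a heuristic:

```python
import torch

def looks_like_low_0(hf_device_map: dict) -> bool:
    """Hypothetical heuristic: some GPUs hold modules, but GPU 0 holds none."""
    gpu_ids = {d for d in hf_device_map.values() if isinstance(d, int)}
    return bool(gpu_ids) and 0 not in gpu_ids

def check_model_device(model, training_device: torch.device) -> None:
    """Sketch of a device-consistency check with a balanced_low_0 carve-out."""
    gpu_ids = {d for d in model.hf_device_map.values() if isinstance(d, int)}
    if not gpu_ids:
        return  # model entirely on CPU/disk; nothing to check
    main_device = torch.device("cuda", min(gpu_ids))
    if main_device != training_device and not looks_like_low_0(model.hf_device_map):
        raise ValueError(
            "Model is loaded on a different device than the one you're training on."
        )
```

As the next comment argues, though, keying the check to one particular layout adds coupling.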

@xkszltl (Author) commented Apr 5, 2024

I'm not familiar with the details here, but design-wise that's a bad choice because it adds coupling. Memory affinity can be versatile, and low_0 is only one of the profiles that can produce a counter-example. I believe that in NPP (naive pipeline parallelism) the entire GPU array is treated as one group, so it shouldn't matter which device is the model device.

@xkszltl (Author) commented Apr 19, 2024

Any other concerns?

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@SunMarc (Member) commented May 14, 2024

Closing this, since the issue should be solved by this PR.

@SunMarc closed this May 14, 2024

Successfully merging this pull request may close these issues:

Accelerate refuses to work on balanced_low_0 when GPU 0 is not filled.