Add multi-gpu support for cupy scheme #5007
Conversation
GitHub CoPilot suggested allowing the user to exclude GPUs via an environment variable if, for example, someone else is using them heavily.
I might be missing something, but I think CoPilot is over-engineering this, possibly because it doesn't know how CUDA device IDs work in cupy (the cupy ID doesn't match the physical ID). I think what you're describing is already supported with the changes I added. CUDA has an environment variable, CUDA_VISIBLE_DEVICES, that controls which physical devices a process can see and how they are renumbered into the logical device IDs cupy uses.
So with the current code, one could set CUDA_VISIBLE_DEVICES to exclude e.g. physical device 2, and cupy sees no difference. Here's an example that tries to put an array on the second visible GPU for different values of CUDA_VISIBLE_DEVICES.
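The original example did not survive extraction; below is a minimal sketch of what it could look like, assuming cupy is installed and the machine exposes at least three physical GPUs. The device index and the example CUDA_VISIBLE_DEVICES values are illustrative only.

```python
# Minimal sketch (not the original example): put an array on cupy's logical
# device 1 and report which physical GPU that actually is.
import os

# CUDA reads this at initialisation, so set it before importing cupy.
# "0,1,2": logical device 1 is physical GPU 1.
# "0,2,3": physical GPU 1 is hidden, so logical device 1 is physical GPU 2.
os.environ.setdefault("CUDA_VISIBLE_DEVICES", "0,2,3")

import cupy as cp

dev = cp.cuda.Device(1)  # the "second GPU" as far as cupy is concerned
with dev:
    x = cp.zeros(1024)

# The PCI bus ID shows which physical card logical device 1 maps to;
# compare it against the output of nvidia-smi.
print("logical device 1 ->", dev.pci_bus_id)
```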
Okay, I think CoPilot is just reflecting my own limited knowledge here. LGTM!
This PR adds support for using multiple GPUs via MPI with the cupy backend. It is a follow-on to #4952
Standard information about the request
This is a: new feature
This change affects: in theory, any code that uses cupy schemes, but since the cupy scheme is relatively new, I don't believe it is being used in production
This change changes: GPU support
This change:
This change will: N/A
Motivation
The current cupy scheme is limited to a single GPU, but using multiple GPUs when they are available can further accelerate analyses.
Contents
I've added logic to check whether MPI is being used and, if so, set the device number based on the total number of devices visible to cupy; a sketch of the idea is shown below.
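This is only an illustration of the rank-to-device mapping described above, not the code in the PR; it assumes mpi4py and cupy are available and that the job launches one MPI process per GPU.

```python
# Sketch: choose a cupy device from the MPI rank (illustrative, not the PR's code).
from mpi4py import MPI
import cupy as cp

rank = MPI.COMM_WORLD.Get_rank()

# GPUs visible to this process (after any CUDA_VISIBLE_DEVICES filtering).
n_devices = cp.cuda.runtime.getDeviceCount()

# Round-robin assignment; a multi-node job would normally use the node-local
# rank here (e.g. from a communicator split with MPI.COMM_TYPE_SHARED).
device_id = rank % n_devices
cp.cuda.Device(device_id).use()

print(f"rank {rank} -> cupy device {device_id} of {n_devices}")
```

Launched with e.g. `mpirun -np 4 python script.py` on a four-GPU node, each rank would land on its own device.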
Links to any issues or associated PRs
Follow on from #4952
Testing performed
I've tested this for the pre-merger work that was used when testing #4952
Additional notes
In the current version, if the user specifies a device number, it takes priority over the MPI-based logic. We could consider changing this so the value is ignored when using MPI.
I based this implementation on the approach described in this blog post: https://blog.hpc.qmul.ac.uk/strategies-multi-node-gpu/#mpi-process-for-each-gpu-pure-mpi-approach