Implement GPU passthrough functionality to provide hw-acceleration to inner-containers #50
Comments
As expected, GPU-related resources are properly passed inside the sys container:

- Install the same nvidia driver as on the host (i.e. "440").
- Device/driver properly seen within the sys container.
- Install nvidia-runtime within the sys container.
- Check everything's looking good so far.
- Finally, launch a cuda-app as an L2 container.
- Pick an earlier cuda-app image -- looks like our nvidia driver (440) may not be the latest.

Hmm, need to check what this nvidia prestart hook is doing to trigger an EPERM while mounting this resource ...
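For reference, a rough sketch of the kind of commands behind the steps above (the exact commands and output weren't preserved here; package names, driver version, and images are assumptions):

```sh
# Inside the sys container (Ubuntu assumed; nvidia apt repositories assumed to be configured).

# Install the same nvidia driver version as the host (440 in this example):
apt-get update && apt-get install -y nvidia-driver-440

# Verify the device/driver is visible within the sys container:
ls -l /dev/nvidia*
nvidia-smi

# Install the nvidia runtime so the inner Docker can use the GPU:
apt-get install -y nvidia-container-runtime

# Launch a cuda-app as an L2 (inner) container, picking a CUDA image
# that driver 440 supports (e.g. CUDA 10.2 rather than the latest):
docker run --rm --gpus all nvidia/cuda:10.2-base nvidia-smi
```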
Will this work for multiple GPUs?
@evberrypi, we haven't initiated this effort yet, so we may find surprises that could easily influence this feature's scope, but yes, we do intend to support multiple GPUs. Ideally, our runtime should be capable of detecting the available GPUs and exposing them automatically within the system container. Alternatively, the user should be able to specify which GPU to utilize inside the container, and have only that one exposed in the container's rootfs. Would this meet your requirements? Also, if you don't mind, could you please describe the use case / setup you have in mind? Thanks.
The use case would be running Sysbox to replace some CI build/test steps that need to be run on special, pet-like servers with multiple GPUs, in favor of something that can run across various servers. Support for 2 or more GPUs is desired. Auto-detecting the number of GPUs and allocating them could work, but being able to specify the number of GPUs to pass to the container would be ideal and would absolutely meet requirements. Thanks for the prompt reply!
We are also working on this. Nvidia-docker is unable to deal with sysbox-modified userns-remap or cgroups. But
It works but we don't know whether we missed something or something is not useful. Besides,
After starting the inner docker, a
Hi @SoloGao, thanks for sharing that info, much appreciated. We've not had the cycles to add Nvidia GPU support to Sysbox yet, but your findings will certainly help when we do so. Is this something that is high priority for you? What is the use-case you have in mind?
@SoloGao, that's excellent feedback, thanks for that!
We are also thinking about bind-mounting these libraries to avoid copying all this content back and forth. Ideally, Sysbox should be able to hide all this complexity from the user, but I'm not sure how far we can go with this zero-touch approach. The EPERM you are getting is expected, as you're attempting to access a host-owned resource while being in a separate user-namespace. We should be able to fix that too.
Right, we also need to think about how to simplify this process for the user; it doesn't look like an easy task given that we are relying on the regular oci-runc at this level. Let's keep this communication channel open to exchange notes on your findings as well as our planning in this area. This is an important feature for us, and we will start working on it asap. One question for you: have you tried to share one nvidia device across two separate sys-containers + inner-containers?
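For illustration only, a minimal sketch of that bind-mount idea (these are not the exact commands from this thread; library names and paths vary per driver release, and this alone doesn't address device-node access or the EPERM mentioned above):

```sh
# Hypothetical: mount the host's nvidia user-space libraries read-only into a
# sys container at creation time, instead of copying them in afterwards.
docker run --runtime=sysbox-runc -d --name syscont \
  -v /usr/lib/x86_64-linux-gnu/libcuda.so.1:/usr/lib/x86_64-linux-gnu/libcuda.so.1:ro \
  -v /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1:/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1:ro \
  nestybox/ubuntu-focal-systemd-docker
```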
Hello @ctalledo, just a short answer to your question.
GPGPU support is a must-have feature for us. Basically, we are using Docker to run GPGPU-intensive tasks like deep learning on servers. NVIDIA offers lots of pre-configured Docker images at https://ngc.nvidia.com and has built a workflow to run different versions of CUDA or DL frameworks easily. To safely start/stop/remove containers as a user rather than an admin, Podman and Sysbox are the only choices. Moreover, all containers need to expose ports for services like Tensorboard. So, to be able to arrange ports manually, we opted for Docker-in-Docker with Sysbox.
Hi @rodnymolina, thanks for the reply and explanation.
In my opinion, forking https://github.com/NVIDIA/nvidia-container-toolkit might be a good starting point. NVIDIA offers a set of tools to start Docker with GPU support; the docs: https://docs.nvidia.com/datacenter/cloud-native/index.html. Users might only need a patched nvidia-container-toolkit to run the system/inner docker with GPGPU support. They use Go to do this, a superset of what I currently did, with lots of validation work. However, I don't have that much time to dig into the code, so for now I just grab the results and reproduce them. Besides, the cgroups problem might need to be solved for nvidia-container-toolkit.
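For context (not from this thread), this is roughly how nvidia-container-runtime is normally registered with a Docker daemon; the same registration would have to work for the inner Docker inside the sys container:

```sh
# Register the nvidia runtime with the (inner) Docker daemon and restart it.
# Assumes nvidia-container-runtime is already installed in the sys container.
cat > /etc/docker/daemon.json <<'EOF'
{
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
EOF
systemctl restart docker

# GPU containers can then be launched with the nvidia runtime:
docker run --rm --runtime=nvidia nvidia/cuda:10.2-base nvidia-smi
```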
Yes, that works flawlessly even on [sys-A(inner-a, inner-b), sys-B(inner-c, inner-d)] scheme.
Thanks for your detailed responses @SoloGao, it all makes sense. Btw, I've already looked at nvidia-toolkit in the past and that's something that we'll certainly keep in mind. One last thing: if possible, could you please ping me when you have a chance? (rmolina@nestybox.com) There are a couple of points that I would like to clarify about your setup to make sure that we fully address your use-case. Thanks!
@rodnymolina any progress on the GPU side? We would love to see that too.
FYI: another Sysbox user is looking to use hardware accelerators with Sysbox towards the end of 2021.
Me too
FYI: some GPU functionality does work inside a Sysbox container currently, as described in this comment in issue #452.
Are there any updates on this issue? Our use case is running heavy inference/scientific workflows inside nested containers on a Kubernetes pod.
Hi Roshan (@r614), unfortunately no updates yet. As Docker recently acquired Nestybox, we are currently busy integrating Sysbox into Docker Desktop but should get some more cycles to work on Sysbox improvements within a couple of months (and GPU passthrough is one of the top items). Thanks for your patience!
Thanks for the prompt reply - appreciate it! Would love to find out when you guys start work on this, happy to help test it out.
So looking forward to seeing this feature! Is there any plan to release this in 2023?
Hi @kkangle, we are hoping to get some cycles to work on this soon. What's the use case you have in mind, if you don't mind sharing?
@ctalledo Regarding Sysbox supporting the GPU function: can the solution be sped up? We now have an urgent project.
@ctalledo We use Sysbox for Kubernetes-in-Docker and hope to run GPUs in K8s, but we get errors:
Another use case: My university wants to offer ML execution as a CI service in its private GitLab instance, so that it can offer more ML projects in the future. I'm working on this for my Bachelor's thesis, so I'll have to just accept the inherent danger of
Unfortunately we still haven't had the cycles to work on this. However, some users have had limited success exposing GPUs inside Sysbox containers. See here for example.
@SoloGao
@ctalledo We would like to be able to use Sysbox to run containerized tests on Nvidia GPUs under Kubernetes without privileged mode. I'm curious, do you have an estimate of the amount of work required to add this sort of feature to Sysbox?
Hi @christopherhesse, unfortunately I don't have an estimate at this time for a couple of reasons:
If I may ask, what's the big advantage of using Sysbox in your scenario (e.g., why not use regular containers)?
We're unable to launch regular containers inside a Kubernetes pod without privileged mode. Is that supported already somehow?
With Sysbox as the runtime for the pod, yes; without it, I don't believe so.
@ctalledo Great, then IIUC this would be the big advantage of using Sysbox. Does that answer your question? Within a Kubernetes pod, without privileged mode, we want to run a transient container for our tests, and it seems like this is the most promising option there.
Hi @christopherhesse, yes that's a very common use case: running Docker inside an unprivileged K8s pod; the pod is launched with K8s + Sysbox. This way you get a lot more isolation / security around the pod as opposed to using a privileged one. Hope that helps; if you run into any trouble, please let us know. Thanks.
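For anyone landing here, a minimal sketch of that pattern (it assumes the Sysbox K8s install, which provides the sysbox-runc RuntimeClass, on a CRI-O node; GPU access inside the pod is still the open item tracked by this issue):

```sh
# Unprivileged "Docker-in-a-pod" with Sysbox (names are illustrative).
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: docker-in-pod
  annotations:
    io.kubernetes.cri-o.userns-mode: "auto"   # user-namespace mode used by Sysbox on CRI-O
spec:
  runtimeClassName: sysbox-runc               # RuntimeClass created by the Sysbox install
  containers:
  - name: syscont
    image: nestybox/ubuntu-focal-systemd-docker
EOF
```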
@ctalledo Thanks for the confirmation! I am running into the issue that GPUs are not supported, since the tests require GPUs.
We're using this feature to simulate flocks of robots, where each Sysbox container runs a replica of our software, as it would run on the robot. This would be a very useful feature for us, as it lets our simulations stay closer to what our real hardware presents. Essentially we're using the Nestybox runtime to simulate a complete robot stack, so we can test things like swarm-SLAM. This is, presumably, possible without implementing nvidia-container-runtime capabilities, but it makes the job a lot harder and means porting to another compute cluster becomes a nightmare.
@ctalledo, wanted to check to see if there was any appetite for either scoping this work out, or helping build out the capability.
I'm using Coder as an online development environment and using Sysbox in the dev container to enable dockerd-in-docker. Now I need to work on a new project that does GPU-accelerated machine learning, so I hope this will be supported.
Looking forward to this feature too. We are going to use Sysbox as the runtime for CI workloads on top of Kubernetes; the workloads need GPUs, so this will be a blocker for us. And BTW, if there is a scoping of the work, or changes required to make this happen, I will be happy to contribute too.
Would be great to see this feature; what are the chances of it getting worked on in the near future?
Our goal here is to allow Sysbox container hierarchies to make use of hardware-acceleration capabilities offered by system devices. Device 'passthrough' is a concept that applies naturally to system-containers, and as such, it has been on our mind since Sysbox's early days, but it recently came up as part of a conversation with @jamierajewski (thanks for that).
A couple of scenarios where this would be useful:
Even though most of the concepts described here are applicable to any GPU, we will limit the scope of this issue to Nvidia GPUs; let's create separate issues for other GPUs.
At a high level, these are some of the requirements that Sysbox would need to meet:
- Sysbox should identify the GPU devices on the host and expose them automatically to the sysbox containers (through the 'devices' oci-spec attribute); see the sketch at the end of this description.
- Sysbox should provide a mechanism that allows the cuda-toolkit and related nvidia tools, which are required at the host level, to be shared (bind-mounted?) with sysbox containers. This would address two problems:
- Sysbox should allow proper execution of the nvidia-container-runtime within the system containers, which should expose all the required abstractions for the nvidia runtime to operate as if it were running on the host.
This list of requirements will obviously change as we further understand the problem.
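To make the first requirement concrete, below is a rough sketch of the kind of entries sysbox-runc would add to a container's OCI config for a single GPU (the major/minor numbers are the conventional nvidia ones; uid/gid mapping inside the user-namespace and the nvidia-uvm/modeset nodes are glossed over):

```sh
# Illustrative fragment of the OCI config (config.json) that sysbox-runc would
# generate to expose /dev/nvidiactl and /dev/nvidia0 inside the sys container.
cat <<'EOF'
"linux": {
  "devices": [
    { "path": "/dev/nvidiactl", "type": "c", "major": 195, "minor": 255, "fileMode": 438 },
    { "path": "/dev/nvidia0",   "type": "c", "major": 195, "minor": 0,   "fileMode": 438 }
  ],
  "resources": {
    "devices": [
      { "allow": true, "type": "c", "major": 195, "access": "rwm" }
    ]
  }
}
EOF
```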