GPU Support #849
Comments
It would be super helpful to have support for GPUs.
GPU support in Firecracker is very hard/tricky at the moment. With current GPU hardware, there are two major problems:
- A passthrough GPU does DMA directly into guest memory, so the guest's memory has to be pinned in host RAM for the lifetime of the device, which breaks our ability to oversubscribe memory.
- The GPU itself becomes part of the trust boundary: its firmware and on-device state sit outside Firecracker's control, so workload isolation would depend on the hardware vendor.
As a result, there is no known path to supporting GPUs in Firecracker.
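To make the first point concrete, here is a minimal sketch (not Firecracker code; it assumes Linux and the `libc` crate, and the sizes are made up) of why oversubscription works for plain microVMs and why device DMA defeats it: guest RAM is normally mapped lazily, so only the pages the guest actually touches consume host memory, whereas a passthrough device needs the whole range resident and pinned up front.

```rust
use std::ptr;

fn main() {
    const GUEST_MEM_SIZE: usize = 8 << 30; // a pretend guest with 8 GiB of RAM

    // Reserve address space without committing host RAM: pages are allocated
    // lazily, only when first touched. This laziness is the room a host has
    // to oversubscribe memory across many mostly-idle microVMs.
    let mem = unsafe {
        libc::mmap(
            ptr::null_mut(),
            GUEST_MEM_SIZE,
            libc::PROT_READ | libc::PROT_WRITE,
            libc::MAP_PRIVATE | libc::MAP_ANONYMOUS | libc::MAP_NORESERVE,
            -1,
            0,
        )
    };
    assert_ne!(mem, libc::MAP_FAILED, "mmap failed");

    // Touching one byte faults in a single page; resident memory stays tiny.
    unsafe { *mem.cast::<u8>() = 1 };

    // A passthrough GPU does DMA to guest-physical addresses, so the whole
    // range would have to be pinned up front, e.g. (hypothetically):
    //
    //     unsafe { libc::mlock(mem, GUEST_MEM_SIZE) };
    //
    // which commits and locks the full 8 GiB immediately, i.e. exactly the
    // headroom an oversubscribing host was counting on.

    unsafe { libc::munmap(mem, GUEST_MEM_SIZE) };
}
```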
@amrragab8080, we'll be looking at this as part of #1179.
Why do you want to maintain the ability to oversubscribe memory?
Oversubscription is a core part of what makes Firecracker a great way to isolate serverless workloads; that's why we took on a tenet around it [1].

[1] https://github.com/firecracker-microvm/firecracker/blob/master/CHARTER.md
Why is over-subscription a great way to isolate serverless workloads? I genuinely don't know, so the reasoning that led to the existence of the tenet is not self-evident to me.
Like all services, serverless compute providers want to keep their servers busy and to improve their overall utilization. Ideally, every CPU cycle on the service provider's servers is running user code, and every byte of RAM is filled with user data. If servers are sitting idle, that's inefficient. Part of solving this optimization problem is having the ability to oversubscribe a given server's hardware capacity with workloads whose hardware resource usage is statistically uncorrelated, or, even better, with workloads selected specifically to pack well together.
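For what it's worth, the statistical-multiplexing argument behind this is easy to see in a toy simulation (illustrative numbers only; nothing here is Lambda data): many bursty workloads that each reserve memory but sit idle most of the time rarely burst at the same instant, so the aggregate peak stays far below the sum of reservations, and that gap is what an oversubscribing host reclaims.

```rust
// Toy model: 1,000 workloads each "reserve" 128 MiB, use 8 MiB when idle,
// and burst to their full reservation 5% of the time, independently.
// A tiny xorshift PRNG keeps the example dependency-free.

fn xorshift(state: &mut u64) -> u64 {
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    *state
}

fn main() {
    const WORKLOADS: usize = 1_000;
    const RESERVATION_MIB: u64 = 128;
    const IDLE_MIB: u64 = 8;
    const BURST_PROBABILITY: f64 = 0.05;
    const TICKS: usize = 10_000;

    let mut state = 0x9E37_79B9_7F4A_7C15_u64;
    let mut peak = 0_u64;

    for _ in 0..TICKS {
        let mut in_use = 0_u64;
        for _ in 0..WORKLOADS {
            let r = xorshift(&mut state) as f64 / u64::MAX as f64;
            in_use += if r < BURST_PROBABILITY { RESERVATION_MIB } else { IDLE_MIB };
        }
        peak = peak.max(in_use);
    }

    println!("sum of reservations: {} MiB", WORKLOADS as u64 * RESERVATION_MIB);
    println!("observed peak usage: {} MiB", peak);
    // The peak lands well below the reservation sum; pinning every guest page
    // for device DMA would erase that difference.
}
```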
What you appear to be saying is that resource over-subscription helps the hosting service (e.g. AWS Lambda or Fargate) lower its hardware costs (which, in turn, passes savings on to customers...presumably). That is not the same as being great for isolating workloads; it seems to be the opposite, particularly in the case where all workloads attempt to utilize their full resource reservations at the same time. It sounds like the design here is to bet on the workloads not calling in all their debts.

How ingrained in the Firecracker implementation is this resource-over-subscription tenet? Like, would it be remotely feasible to add a feature flag that turns over-subscription off?

P.S. As an aside, the Firecracker tenets don't seem to align with the Fargate project, specifically the tenet that calls out favoring transient or stateless workloads over long-running or persistent workloads. The Fargate docs do not place similar restrictions on its workloads (AFAICT).
Great for isolating serverless workloads, which are bursty and pay-only-when-running. Take a look at https://www.youtube.com/watch?v=QdzV04T_kec; there's some more detail there on how Lambda multiplexes workloads.
Well, it's a tenet so we stick to it unless there's a very good reason to change it.
You're quite right here :) This tenet started out as a powerful simplifying assumption, but as you pointed out, it doesn't quite apply to all the serverless container workloads; we might let go of the "transient and stateless" part.
What is the reason for this? Is the attack surface of e.g. virtio-gpu or Venus excessive?
In theory it is possible to do better by dynamically manipulating guest IOMMU mappings.
Does this also apply to SR-IOV-capable GPUs? What about, e.g., attacks in which the guest overwrites the GPU's vBIOS?
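On the point above about dynamically manipulating guest IOMMU mappings: the Linux VFIO type1 interface does expose per-range DMA map/unmap, so here is a hedged sketch (assumptions: Linux with a glibc-based toolchain, the `libc` crate, and an already-configured VFIO container fd; this is not Firecracker code) of what that could look like. A range of guest RAM is made visible to the device only while it is needed, then unmapped so those pages can be reclaimed again; note that whatever is currently mapped is still pinned by the kernel, so this narrows rather than removes the oversubscription problem.

```rust
// Library-style sketch; call these from wherever the VFIO container fd lives.
use std::os::unix::io::RawFd;

// ioctl numbers from <linux/vfio.h>: _IO(';', VFIO_BASE + 13 / + 14).
const VFIO_IOMMU_MAP_DMA: libc::c_ulong = 0x3B71;
const VFIO_IOMMU_UNMAP_DMA: libc::c_ulong = 0x3B72;
const VFIO_DMA_MAP_FLAG_READ: u32 = 1 << 0;
const VFIO_DMA_MAP_FLAG_WRITE: u32 = 1 << 1;

#[repr(C)]
struct VfioIommuType1DmaMap {
    argsz: u32,
    flags: u32,
    vaddr: u64, // host virtual address backing the guest pages
    iova: u64,  // address the device will use (typically guest-physical)
    size: u64,
}

#[repr(C)]
struct VfioIommuType1DmaUnmap {
    argsz: u32,
    flags: u32,
    iova: u64,
    size: u64,
}

/// Map `size` bytes at host address `vaddr` into the device's IOMMU view at
/// `iova`. The kernel pins these pages for as long as the mapping exists.
fn map_for_dma(container: RawFd, vaddr: u64, iova: u64, size: u64) -> std::io::Result<()> {
    let mut arg = VfioIommuType1DmaMap {
        argsz: std::mem::size_of::<VfioIommuType1DmaMap>() as u32,
        flags: VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
        vaddr,
        iova,
        size,
    };
    match unsafe { libc::ioctl(container, VFIO_IOMMU_MAP_DMA, &mut arg as *mut _) } {
        0 => Ok(()),
        _ => Err(std::io::Error::last_os_error()),
    }
}

/// Tear the mapping down again, releasing the pin on those pages.
fn unmap_dma(container: RawFd, iova: u64, size: u64) -> std::io::Result<()> {
    let mut arg = VfioIommuType1DmaUnmap {
        argsz: std::mem::size_of::<VfioIommuType1DmaUnmap>() as u32,
        flags: 0,
        iova,
        size,
    };
    match unsafe { libc::ioctl(container, VFIO_IOMMU_UNMAP_DMA, &mut arg as *mut _) } {
        0 => Ok(()),
        _ => Err(std::io::Error::last_os_error()),
    }
}
```

Whether a GPU and its driver stack tolerate DMA windows appearing and disappearing like this is a separate question, which is presumably why the comment above frames it as possible only in theory.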
This is becoming increasingly important to support. It may be difficult, but we need to find a way to do this. This can also be managed with NVIDIA MIG (cutting the physical GPU into slices) and exposing a specific slice to the VM. This does impose capacity limitations in the current state, but only on GPU capacity, which is widely accepted at the moment.

Also, this issue is not "Closed": it is not implemented and users are still asking for this. We should move this conversation to #1179 since it is still open.
The public-facing API doesn't seem to have support yet for passthrough PCI devices, namely GPUs. Is this technically feasible?