Support custom docker runtimes #7589
@notnoop can you hook me up with the right reviewers for this? We want to be able to use gvisor as the docker runtime and I feel this change only adds more power. Users can use constraints to ensure a job with a custom runtime is scheduled on machines which are configured for it.
Thank you so much for submitting your PR - the PR looks good. I have a couple of questions/suggestions - if you address them by Monday, I'll make sure it's included in 0.11.0.
drivers/docker/driver_test.go
@@ -1029,6 +1029,29 @@ func TestDockerDriver_SecurityOptFromFile(t *testing.T) {
	require.Contains(t, container.HostConfig.SecurityOpt[0], "reboot")
}

func TestDockerDriver_OCIRuntime(t *testing.T) {
I would suggest modeling this on the TestDockerDriver_CreateContainerConfig_* tests - those are a bit more lightweight as they don't actually create containers.
This enables customers who want to use gvisor and have it configured on their clients.
…e runtime; add conflict test
Upon reflecting further, we had a concern about runtime security. Given that runtimes offer varying levels of security, we worry about jobs bypassing host or operator controls by using a runtime that is less secure than the default one, or one that provides more capabilities than intended. We've decided to target this for 0.11.1 and would love to see include/exclude/disable options in the client config. We can follow up with that after cutting 0.11.0 as well! Does that sound reasonable? Thanks a lot again.
No concerns with the timeline, thank you for updating. Regarding next steps: the concern makes sense. However, docker will reject a runtime unless it is configured at the daemon, so one answer to this is that an explicit allow list already exists - it is whatever is configured at the docker daemon. I am not sure whether adding client config expands on that capability?
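To make the include-list idea in this exchange concrete, here is a minimal Go sketch of a client-side check layered on top of whatever the docker daemon already accepts. Everything in it is an assumption for illustration: `runtimeAllowed`, the `allowed` slice, and the runtime names `runc`, `runsc` and `kata` are hypothetical and are not taken from this PR or from Nomad's client config.

```go
package main

import "fmt"

// runtimeAllowed is a hypothetical helper (not part of this PR). It reports
// whether a runtime requested by a job passes a client-side include list.
// An empty request means "use the docker daemon's default runtime" and is
// always accepted in this sketch.
func runtimeAllowed(allowed []string, requested string) bool {
	if requested == "" {
		return true
	}
	for _, name := range allowed {
		if name == requested {
			return true
		}
	}
	return false
}

func main() {
	// Example include list; the names are illustrative assumptions.
	allowed := []string{"runc", "runsc"}
	for _, rt := range []string{"", "runsc", "kata"} {
		fmt.Printf("runtime %q allowed: %v\n", rt, runtimeAllowed(allowed, rt))
	}
}
```

A check like this would only narrow the set of runtimes further: a runtime that passes it would still be rejected by docker if it is not configured at the daemon, which is the point made in the comment above.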
@@ -760,6 +760,12 @@ func (d *Driver) createContainerConfig(task *drivers.TaskConfig, driverConfig *T
		}
		hostConfig.Runtime = d.config.GPURuntimeName
	}
	if driverConfig.Runtime != "" && driverConfig.Runtime != hostConfig.Runtime {
		if hostConfig.Runtime != "" {
			return c, fmt.Errorf("runtime '%s' requested conflicts with gpu runtime '%s'", driverConfig.Runtime, hostConfig.Runtime)
what GPU runtime? I don't understand this error message :)
The runtime is set above as part of the docker.nvidia_runtime docker client configuration. It looks like this errors instead of allowing a job to override the value with a configured runtime.
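The behaviour described in this review thread can be sketched as a standalone Go function. This is an illustration, not the PR's code: `resolveRuntime` is a hypothetical name, the values `nvidia` and `runsc` are example runtime names, and the final step (using the requested runtime when no GPU runtime is set) is an assumption, since the hunk shown above is cut off before that point.

```go
package main

import "fmt"

// resolveRuntime is a hypothetical helper mirroring the check in the diff.
// gpuRuntime stands in for the value already placed in hostConfig.Runtime
// from the client's GPU runtime setting (empty if none was applied), and
// requested stands in for driverConfig.Runtime from the job.
func resolveRuntime(gpuRuntime, requested string) (string, error) {
	// Nothing requested, or the request matches what is already set:
	// keep the current value. Empty means the docker daemon's default.
	if requested == "" || requested == gpuRuntime {
		return gpuRuntime, nil
	}
	// A GPU runtime is already set and the job asks for a different one:
	// fail instead of silently overriding either choice.
	if gpuRuntime != "" {
		return "", fmt.Errorf("runtime '%s' requested conflicts with gpu runtime '%s'", requested, gpuRuntime)
	}
	// Assumption: with no GPU runtime in play, the job's runtime is used.
	return requested, nil
}

func main() {
	cases := []struct{ gpu, requested string }{
		{"", ""},
		{"", "runsc"},
		{"nvidia", "nvidia"},
		{"nvidia", "runsc"},
	}
	for _, tc := range cases {
		rt, err := resolveRuntime(tc.gpu, tc.requested)
		fmt.Printf("gpu=%q requested=%q -> runtime=%q err=%v\n", tc.gpu, tc.requested, rt, err)
	}
}
```

Only the last case returns an error, which is the situation the error message in the diff is describing: a job that needs the GPU runtime and also names a different runtime.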
Just poking this thread, it looks like 0.11.1 has already been released. @notnoop what do you think needs to be added to this? What do you think of the argument that cluster admins can control this behavior by choosing which runtimes to install to docker on the machine in the first place? For example, if they do not want users to pick
Thanks for the ping! I'll follow up. Sadly, 0.11.1 was an urgent fix for the panic, which caused us to shift plans a bit.
Sorry for the slowness here. I have followed up in #7932. Thank you so much again for your contribution!
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.