Add root buildpack as interface for app image extension RFC #77
Conversation
# What it is
[what-it-is]: #what-it-is

## Specification
- *root buildpack* - a special case of buildpack that is run with privileges
Could a better name for these buildpacks be stack buildpacks?
I'm fine with either. I think to most end-users they will just be "buildpacks", so it's probably more a question for implementors of those buildpacks.
I like the idea of buildpacks being the extensibility mechanism for stack extensions!

`pack upgrade sclevine/myapp`

This will run `pack upgrade sclevine/run`, generate an ephemeral image, and then do the equivalent of `pack rebase sclevine/myapp`.
Doesn't executing arbitrary "root buildpacks" before a rebase mean that rebase is no longer a safe operation?
I believe it makes it possible that rebase is unsafe. Paging @sclevine.
Sorry I missed this. This is probably clear by now, but the idea is that we would re-play the root buildpacks on the new run image before rebasing.
Do root buildpacks participate in …

Note from chat on 5/20: would be good to ensure that the app developer doesn't need to pick whether the root buildpack runs on the run image vs. build image vs. both.
```toml
[[excludes]]
paths = [ "/var" ]
```
Does this mean that root buildpack authors need to know what paths the packages the buildpack installs write to for layer exclusion?
If they want to slim down the run image, then yes they do. Otherwise they can just ignore this file. This is a good thing to point out because this proposal is tilting toward giving the buildpack authors more responsibility. But with that, I think we keep a highly orthogonal interface (i.e. everything is a buildpack).
Yeah, just thinking aloud. I think that might work well for cases where a root buildpack is specific to a particular package - say, installing curl - but not so well if it supports arbitrary packages based on files in the application repo?
@jabrown85 yes, root buildpacks can …
* `paths` - a list of path globs to exclude from the run image
* `cache` - if set to `true`, the paths will be stored in the cache
By "the paths will be stored in the cache", do we mean that the snapshotted layer will be stored in the cache?
Yes, but as @matthewmcnew pointed out, we could choose to store it as an intermediate build image (no need for snapshot in that case I think).
After more thought, I'm unsure if we can get away from the snapshot layer entirely. We may need it to restrict what changes are preserved (for example, we do not want to preserve changes to the `/layers` or `/workspace` dirs).
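To make the `paths`/`cache` pair above concrete, here is a sketch of how a root buildpack might combine them; the apt scenario and the specific paths are hypothetical, not part of the RFC:

```toml
# Hypothetical launch.toml fragment from a root buildpack that installs
# OS packages with apt; the paths shown are illustrative only.
[[excludes]]
# package lists are not needed at run time, but caching them speeds up rebuilds
paths = [ "/var/lib/apt/lists" ]
cache = true

[[excludes]]
# scratch data that should neither ship in the run image nor be cached
paths = [ "/var/tmp" ]
cache = false
```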
The `[process]` and `[[slices]]` tables can be used as normal.
If we allow root buildpacks to use slices, are we assuming that the app dir is excluded from the snapshot layer?
yes, the app dir should be excluded from the snapshot layer. I'll call that out.
text/0000-app-image-extensions.md (Outdated)
All root buildpacks will be sliced out of the list of buildpacks, and run before non-root buildpacks. This creates a version of the current builder with additional layers and an ephemeral run image with additional layers, then does a normal `pack build`.
* The run image layers are persisted as part of the app image, and can be reused on subsequent builds.
I am confused about how we generate run image layers using the snapshotting mechanism. Wouldn't we need to run the root buildpacks on the run image itself to capture the correct snapshot?
Also, how does a root buildpack opt into reusing a run-image layer? Typically, we use the presence of a `layer.toml` file combined with the absence of layer content to signal layer reuse, but that doesn't work if the contents are not being written to the layers dir.
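For reference, the reuse signal being described looks roughly like this under the existing (non-root) layers contract at the time of this RFC; the layer name and metadata are illustrative:

```toml
# <layers>/example-layer.toml exists, but no <layers>/example-layer/ directory:
# under the current contract this asks the lifecycle to reuse the prior layer.
launch = true
cache = true

[metadata]
# arbitrary buildpack-defined data used to decide whether reuse is still valid
version = "1.2.3"
```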
Yes, the root buildpacks must run twice: once on the build image and once on the run image. I'm not sure about reusing the run-image layer. It might not be possible.
I think we should cache the previous run image layers locally if they can be reused. They need to be built locally anyways.
* The run image layers are persisted as part of the app image, and can be reused on subsequent builds.
* The build image layers will be persisted as part of the cache, and can be reused on subsequent builds.
* Any paths that are excluded using `launch.toml` and set to be cached will be stored in the cache, and can be reused on subsequent builds.
Would the restorer re-apply the snapshotted build layer? Does that make it hard for the buildpack to remove and rebuild that layer if it is out of date?
If the buildpack cannot rebuild with the snapshot layer applied, then it is not idempotent and the yet-to-be-named key in the `buildpack.toml` must specify this.
`project.toml`:
```toml
[[build.buildpacks]]
```
Should this be `detect.buildpacks`?
No, that's not a table in the latest definition of project.toml. So I'm proposing that the root buildpacks be listed in the same list as non-root buildpacks.
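So, under that proposal, a `project.toml` might look roughly like this; the buildpack IDs are made up for illustration:

```toml
# Root and non-root buildpacks share the same [[build.buildpacks]] list;
# the lifecycle would slice out the root ones and run them first.
[[build.buildpacks]]
id = "example/apt"       # hypothetical root buildpack
version = "0.0.1"

[[build.buildpacks]]
id = "example/ruby"      # hypothetical regular (unprivileged) buildpack
version = "1.2.3"
```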
All root buildpacks will be sliced out of the list of buildpacks, and run before non-root buildpacks. This creates a version of the current builder with additional layers and an ephemeral run image with additional layers, then does a normal `pack build`.
* The run image layers are persisted as part of the app image, and can be reused on subsequent builds.
* The build image layers will be persisted as part of the cache, and can be reused on subsequent builds.
Where would information about these layers (such as their digest) be stored? Is that the responsibility of the platform?
I think the lifecycle would manage these layers (the same way as if you are storing the cache in an image). But I do need to address how they are stored when the cache is a volume. I'll fix this up
text/0000-app-image-extensions.md (Outdated)
This will run `pack upgrade sclevine/run`, generate an ephemeral image, and then do the equivalent of `pack rebase sclevine/myapp`.
Attempting to rebase an app directly after it's been upgraded is not permitted and will fail, because the current base is now ephemeral.
Is it possible you could elaborate on this?
I added "In effect, we are saying that the use of root buildpacks breaks rebase." Does that help?
Signed-off-by: Joe Kutner <jpkutner@gmail.com>
Sounds good to me. I have no skin in this game (I'm not planning on using this feature), and it doesn't seem any more dangerous than any other proposal.
@sclevine I'm still a little unsettled with:
I don't think it does a good job of accounting for the python+sqlite case (it pushes the burden onto the end-user, which is the opposite of the buildpacks value-prop). There are many other cases like this (such as Rails ActiveStorage with ffmpeg).

What about this: if we allow root buildpacks to satisfy mixin requirements AND opt-out of running when those packages are already provided by the stack (such as by finding them in the build plan), then it eliminates some of the performance concerns.

With regard to an ecosystem of buildpackages that rampantly include root buildpacks, I'm just not worried about it. Most end-users will use one of the following:
I definitely agree that end-user developers should not have to deal with this. My intent isn't to force developers to specify the root buildpacks they want. I want builder authors / platform maintainers to have control over this instead. E.g., the heroku builder could supply an apt-buildpack in order.toml for every group.
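As a sketch of that idea, a builder's order definition could pin a root buildpack at the head of every group; the IDs and the placement at the head of each group are illustrative assumptions, not part of the RFC:

```toml
# Illustrative order definition: the builder author puts the same root
# buildpack into each group so end users never have to list it themselves.
[[order]]
  [[order.group]]
  id = "heroku/apt"        # hypothetical root buildpack
  version = "0.0.1"

  [[order.group]]
  id = "heroku/ruby"
  version = "2.0.0"

[[order]]
  [[order.group]]
  id = "heroku/apt"
  version = "0.0.1"

  [[order.group]]
  id = "heroku/nodejs"
  version = "3.0.0"
```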
How would a platform maintainer prevent developers from installing OS packages that they don't want them to use? They would end up needing to fork meta-buildpacks to remove/replace root buildpacks.
Aren't we trying to build a large community registry of composable buildpacks in the long run?
How will root buildpacks contribute mixins in the build plan? Will all mixin validation occur in detect? Will buildpacks be able to indicate if they provide mixins, giving platforms an early indication if a builder has a chance of satisfying buildpack mixin requirements?
`pack build sclevine/myapp`
All root buildpacks will be sliced out of the list of buildpacks, and run before non-root buildpacks. This creates a version of the current builder with additional layers and an ephemeral run image with additional layers, then does a normal build phase without running the root buildpacks.
Could you clarify what the "ephemeral run image" is here? How does it relate to the result of extend phase?
Confused about this also. Maybe "and an ephemeral run image with additional layers" was supposed to be replaced by the "new extend phase" below?
I found the wording a bit confusing also. Is this what you meant?
Suggested change:
> All root buildpacks will be sliced out of the list of buildpacks, and run before non-root buildpacks. This creates a version of the current builder with additional layers and an ephemeral run image with additional layers. `Pack` then does a normal build phase, without running the root buildpacks.
* The build image layers will be persisted as part of the cache, and can be reused on subsequent builds.
* Any paths that are excluded using `launch.toml` and set to be cached will be stored in the cache, and can be reused on subsequent builds.
## Upgrade an App |
While I understand the allure of a pack upgrade flow, I feel pretty strongly that this should just be accomplished by a complete rebuild. An "upgrade" that re-executes some but not all buildpacks seems arbitrary and confusing.
The `pack upgrade` flow is just supposed to run the root buildpacks, and only on the run image. This does bring up a serious issue though: the app isn't accessible during a rebase.

@jkutner I think root buildpacks will need something like a `/bin/rebase` executable and/or some way of storing metadata about the app. Otherwise rebase would require the original source code.

Alternatively, we could only give root buildpacks access to metadata in project.toml and not the app directory.
I agree with @matthewmcnew, if I'm understanding the sentiment correctly.

My comment: I propose that we eliminate the notion of `upgrade` and still only have the concept of (re)`build` and `rebase`. If you envision the layers as below and then associate the two operations, it becomes easy to understand and eliminates the complexity being introduced by `upgrade`.
```
+-------------------------------+ +-------+
| app                           |         |
+-------------------------------+         |
                                          |
+-------------------------------+         |
| app deps from buildpacks      |         | (Re)Build
+-------------------------------+         |
                                          |
+-------------------------------+         |
| app deps from root buildpacks |         |
+-------------------------------+ +-------+
+-------------------------------+ +-------+
|                               |         |
| OS from stack                 |         | Rebase
|                               |         |
+-------------------------------+ +-------+
```
As an additional (possible) upside, by not providing additional "upgradability" for this set of layers, we steer buildpack authors/developers to push recurring dependencies to the stack layers to improve performance.
+1 to @jromero and @matthewmcnew. I think adding an `upgrade` would be confusing for new consumers, who know they need to do something for their app but aren't sure what. If there's an explicit need at a later point, great, but adding it preemptively sounds to me like the confusion risks outweigh the benefits.
I'm starting to worry about @matthewmcnew's point about rebasing being potentially unsafe after root buildpacks are used. Unless we guarantee that root buildpacks only install packages that remain ABI compatible on updates, rebase could change app behavior. This could break apps in production and cause users to lose faith that the rebase process is safe. If we decide that rebase isn't safe in those cases, then any use of root buildpacks will drastically hurt performance for at-scale security patches. Full rebuilds would be necessary for all images.

We can keep rebase safe with root buildpacks, but only if platforms like kpack can whitelist which root buildpacks are available. If arbitrary meta-buildpacks can contain root buildpacks, then this becomes much harder.

More generally, I feel like root buildpacks make it too easy for buildpack authors to degrade the end-user experience (for both developers and operators) and that we need some sort of control to prevent that from happening. That control shouldn't hurt interoperability (e.g., prevent heroku buildpacks from being used with kpack + many apps). I'm not strongly opinionated about how we achieve this, but disallowing root buildpacks and regular buildpacks to mix in a buildpackage would be a start.

Would you be open to that option if we allowed root buildpacks to be installed as enforced pre-build hooks on builders? Then developers would never need to think about them in the OS package case, even when manually specifying buildpacks. If a buildpack needs a package, either the stack would contain a given package, or the package would get automatically installed.
Is it the case that a buildpack needs a package, or that an app needs a package? As an example, I might write an app that calls out - via subprocess - to headless chrome (installed via some deb package) to do Things™. The buildpack didn't link to chrome for any installation purpose, but it's a runtime requirement. Asking as I'm not sure if that would impact the end user interface here.
@josegonzalez I think it's both. Some buildpacks will depend on a package, and some apps will depend on a package. Ideally the former is completely hidden from the end-user (only the buildpack author is aware of it). But I think we're afraid that any solution will force the end-user (app developer) to be aware of it.
Based on a breakout discussion with @hone @sclevine and @ekcasey, I'm going to try and tweak this RFC such that the following are true:
What is missing, IMO, is a way for an end-user to extend the stack (i.e. to provide their own root buildpacks). A few options I'd like to consider for this are:
nit: Spelling
The advantages of using a buildpack for privileged operations are the same as for unprivileged operations: they are composable, fast (caching), modular, reusable, and safe.
As an example, consider the use case of Acme.com. They have three teams that want to contibute privileged layers to every image built at the company. Each of these teams wants to manage their own artifacts and the mechanisms for installing them, but none of these teams are in control of the base image. The teams have special logic for when their layers should be added to an image; in some cases it's the precense of a Node.js app, and in others it's the use of Tomcat, etc. In the past, these teams were trying to contribute their layers by selectively adding them to every `Dockerfile` for every base image in the company, but this doesn't scale well. With Root Buildpacks, each team can manage their own artifacts, release cadence, and add logic to `bin/detect` to ensure the layers are added when needed.
Suggested change:
> As an example, consider the use case of Acme.com. They have three teams that want to contribute privileged layers to every image built at the company. Each of these teams wants to manage their own artifacts and the mechanisms for installing them, but none of these teams are in control of the base image. The teams have special logic for when their layers should be added to an image; in some cases it's the presence of a Node.js app, and in others it's the use of Tomcat, etc. In the past, these teams were trying to contribute their layers by selectively adding them to every `Dockerfile` for every base image in the company, but this doesn't scale well. With Root Buildpacks, each team can manage their own artifacts, release cadence, and add logic to `bin/detect` to ensure the layers are added when needed.
This was correcting the spelling of presence (currently precense).
When `privileged` is set to `true`, the lifecycle will run this buildpack as the `root` user.
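Presumably the flag would live alongside the rest of the buildpack's metadata in `buildpack.toml`; the exact placement below is an assumption based on the quoted sentence, and the ID is made up:

```toml
# Sketch of a root buildpack's buildpack.toml; `privileged` is the proposed
# flag, everything else follows the usual buildpack schema.
api = "0.4"

[buildpack]
id = "example/curl-installer"   # hypothetical ID
version = "0.0.1"
privileged = true               # proposed: run bin/detect and bin/build as root

[[stacks]]
id = "io.buildpacks.stacks.bionic"
```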
For each root buildpack, the lifecycle will use [snapshotting](https://github.com/GoogleContainerTools/kaniko/blob/master/docs/designdoc.md#snapshotting-snapshotting) to capture changes made during the buildpack's build phase (excluding `/tmp`, `/cnb`, and `/layers`). Alternatively to snapshotting, a platform may store a new stack iamge to cache the changes. All of the captured changes will be included in a single layer produced as output from the buildpack. The `/layers` dir MAY NOT be used to create arbitrary layers.
Suggested change:
> For each root buildpack, the lifecycle will use [snapshotting](https://github.com/GoogleContainerTools/kaniko/blob/master/docs/designdoc.md#snapshotting-snapshotting) to capture changes made during the buildpack's build phase (excluding `/tmp`, `/cnb`, and `/layers`). As an alternative to snapshotting, the platform may store a new stack image to cache the changes. All of the captured changes will be included in a single layer produced as output from the buildpack. The `/layers` dir MAY NOT be used to create arbitrary layers.
The `[process]` and `[[slices]]` tables can be used as normal.
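For example, a root buildpack could still declare a process and slice the app directory the way any buildpack can today; the values below are illustrative:

```toml
# Illustrative launch.toml tables from a root buildpack.
[[processes]]
type = "web"
command = "nginx"
args = ["-g", "daemon off;"]

[[slices]]
# app-relative paths, per the existing slices contract; the app dir itself
# is excluded from the root buildpack's snapshot layer (see discussion above)
paths = ["public/assets/**"]
```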
The following constraint(s) will be enforced:
* If a user attempts to create a buildpackage including both root buildpacks and non-root buildpacks, the process will fail.
What's the rationale behind this constraint? What is the concern it's addressing?
This addresses a concern that @sclevine had about root buildpacks proliferating through the ecosystem. Like, if we give this tool to people, they will use it too frequently. If we instead restrict its use, it forces buildpacks authors to explore alternative solutions.
I don't necessarily agree with that concern, but I'm willing to accept it in compromise (in part because it's easy to reverse).
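In buildpackage terms, the constraint would reject a meta-buildpack whose `package.toml` mixes the two kinds of dependency; something like the following (IDs, URIs, and field names are hypothetical sketches, not taken from the RFC) would fail:

```toml
# Hypothetical package.toml that violates the proposed constraint by mixing
# a root buildpack and a regular buildpack in one buildpackage.
[buildpack]
uri = "."                                            # this meta-buildpack

[[dependencies]]
uri = "docker://example/apt-root-buildpack:0.0.1"    # privileged (illustrative)

[[dependencies]]
uri = "docker://example/ruby-buildpack:1.2.3"        # unprivileged (illustrative)
```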
* Should we allow non-root buildpacks to run in combination with root buildpacks?
* How do the packages installed by a root buildpack relate (or not relate) to mixins?
* That is, can you use a root buildpack to install a package that satisfies the requirements of a buildpack's required mixins?
Why wouldn't this be a positive outcome?
I think it would be positive, but it's an open question because we aren't defining how it would work.
Signed-off-by: Joe Kutner <jpkutner@gmail.com> Co-authored-by: David Freilich <dfreilich@vmware.com>