Add root buildpack as interface for app image extension RFC #77

Closed
wants to merge 23 commits into from
Changes from 15 commits
b97ea49
RFC: app image extensions (OS packages)
sclevine Aug 10, 2019
abf9731
Update 0000-app-image-extensions.md
sclevine Aug 10, 2019
f11dec6
Add pack CLI UX, change contract
sclevine Aug 13, 2019
6c8b006
Add metadata.toml for image status
sclevine Aug 13, 2019
0f4b7de
Fix comment indicator for metadata files
sclevine Aug 13, 2019
77a9802
Add pack extend-builder command + question about nesting
sclevine Aug 13, 2019
e1d8d91
Add per-app packages to app image ext RFC
sclevine Aug 14, 2019
95afa8d
Minor UX updates to app image ext RFC
sclevine Mar 1, 2020
ac23238
Update 0000-app-image-extensions.md
sclevine Apr 23, 2020
338e77f
Add root buildpack as interface for app image extension RFC
jkutner May 13, 2020
f2dd707
Updated os ext RFC with a different root buildpacks proposal
jkutner May 15, 2020
4d1651f
Updates from WG: remove alternatives, consolidate buildpack lists, ad…
jkutner May 21, 2020
046628b
Added details of a new extend phase
jkutner Jun 4, 2020
d500e76
Revisions to root buildpack RFC based on WG discussion
jkutner Jun 4, 2020
7a3a1a0
Revisions to root buildpack RFC based on WG discussion
jkutner Jun 5, 2020
cb2be5a
Update text/0000-app-image-extensions.md
jkutner Jun 17, 2020
f122212
Update text/0000-app-image-extensions.md
jkutner Jun 17, 2020
8c38d14
Update text/0000-app-image-extensions.md
jkutner Jun 17, 2020
fd89506
Update text/0000-app-image-extensions.md
jkutner Jun 17, 2020
416a410
Update text/0000-app-image-extensions.md
jkutner Jun 17, 2020
16e452f
Update text/0000-app-image-extensions.md
jkutner Jun 17, 2020
ee2bed1
Update text/0000-app-image-extensions.md
jkutner Jun 17, 2020
196eaaf
Update text/0000-app-image-extensions.md
jkutner Jun 17, 2020
243 changes: 243 additions & 0 deletions text/0000-app-image-extensions.md
@@ -0,0 +1,243 @@
# Meta
[meta]: #meta
- Name: App Image Extensions
- Start Date: 2019-08-09
- Author(s): [Stephen Levine](https://github.com/sclevine), [Joe Kutner](https://github.com/jkutner/)
- CNB Pull Request: (leave blank)
- CNB Issue: (leave blank)
- Supersedes: (put "N/A" unless this replaces an existing RFC, then link to that RFC)

# Summary
[summary]: #summary

This RFC proposes:
1. A UX for the pack CLI for extending builders with OS packages
1. A type of buildpack that runs as the root user

# Motivation
[motivation]: #motivation

Allowing buildpacks to install OS packages dynamically during build would drastically increase build time, especially when CNB tooling is used to build or rebase many apps with similar package requirements.

Mixins already allow buildpack authors to create buildpacks that depend on an extended set of OS packages without affecting build time.
However, it is not uncommon for application code to depend on OS packages.

Given that app authors and platform maintainers experience increased build time directly, extending the stack at the app author's request may add flexibility without sacrificing performance. At the same time, we want the CNB interface to be highly orthogonal, which is why we've strived to "make everything a buildpack". Allowing some buildpacks to run with privileges would give users the flexibility they expect based on their experience with other tools like `Dockerfile`.

The advantages of using a buildpack for privileged operations are the same as for unprivileged operations: they are composable, fast (caching), modular, reusable, and safe.

As an example, consider the use case of Acme.com. They have three teams that want to contibute privileged layers to every image built at the company. Each of these teams wants to manage their own artifacts and the mechanisms for installing them, but none of these teams are in control of the base image. The teams have special logic for when their layers should be added to an image; in some cases it's the precense of a Node.js app, and in others it's the use of Tomcat, etc. In the past, these teams were trying to contribute their layers by selectively adding them to every `Dockerfile` for every base image in the company, but this doesn't scale well. With Root Buildpacks, each team can manage their own artifacts, release cadence, and add logic to `bin/detect` to ensure the layers are added when needed.
Member

Suggested change
As an example, consider the use case of Acme.com. They have three teams that want to contibute privileged layers to every image built at the company. Each of these teams wants to manage their own artifacts and the mechanisms for installing them, but none of these teams are in control of the base image. The teams have special logic for when their layers should be added to an image; in some cases it's the precense of a Node.js app, and in others it's the use of Tomcat, etc. In the past, these teams were trying to contribute their layers by selectively adding them to every `Dockerfile` for every base image in the company, but this doesn't scale well. With Root Buildpacks, each team can manage their own artifacts, release cadence, and add logic to `bin/detect` to ensure the layers are added when needed.
As an example, consider the use case of Acme.com. They have three teams that want to contribute privileged layers to every image built at the company. Each of these teams wants to manage their own artifacts and the mechanisms for installing them, but none of these teams are in control of the base image. The teams have special logic for when their layers should be added to an image; in some cases it's the presence of a Node.js app, and in others it's the use of Tomcat, etc. In the past, these teams were trying to contribute their layers by selectively adding them to every `Dockerfile` for every base image in the company, but this doesn't scale well. With Root Buildpacks, each team can manage their own artifacts, release cadence, and add logic to `bin/detect` to ensure the layers are added when needed.

Member

This was correcting the spelling of presence (currently precense).


# What it is
[what-it-is]: #what-it-is

- *root buildpack* - a special case of buildpack that is run with privileges
Contributor

Could a better name for these buildpacks be stack buildpacks?

Member Author

I'm fine with either. I think to most end-users they will just be "buildpacks", so it's probably more a question for implementors of those buildpacks.


Application developers can use root buildpacks to extend their build and/or run images.

We introduce a boolean `privileged` key in the `[buildpack]` table of `buildpack.toml`, which is defined as follows:

```
[buildpack]
privileged = <boolean (default=false)>
```

When `privileged` is set to `true`, the lifecycle will run this buildpack as the `root` user.

For each root buildpack, the lifecycle will use [snapshotting](https://github.com/GoogleContainerTools/kaniko/blob/master/docs/designdoc.md#snapshotting-snapshotting) to capture changes made during the buildpack's build phase (excluding `/tmp`, `/cnb`, and `/layers`). Alternatively to snapshotting, a platform may store a new stack iamge to cache the changes. All of the captured changes will be included in a single layer produced as output from the buildpack. The `/layers` dir MAY NOT be used to create arbitrary layers.
Member

Eventually, we could support slices to split up kaniko's userspace snapshots (and improve performance).

Member Author

I actually proposed that [[slices]] would work as normal a few lines after this.

Member

Suggested change
For each root buildpack, the lifecycle will use [snapshotting](https://github.com/GoogleContainerTools/kaniko/blob/master/docs/designdoc.md#snapshotting-snapshotting) to capture changes made during the buildpack's build phase (excluding `/tmp`, `/cnb`, and `/layers`). Alternatively to snapshotting, a platform may store a new stack iamge to cache the changes. All of the captured changes will be included in a single layer produced as output from the buildpack. The `/layers` dir MAY NOT be used to create arbitrary layers.
For each root buildpack, the lifecycle will use [snapshotting](https://github.com/GoogleContainerTools/kaniko/blob/master/docs/designdoc.md#snapshotting-snapshotting) to capture changes made during the buildpack's build phase (excluding `/tmp`, `/cnb`, and `/layers`). As an alternative to snapshotting, the platform may store a new stack image to cache the changes. All of the captured changes will be included in a single layer produced as output from the buildpack. The `/layers` dir MAY NOT be used to create arbitrary layers.
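
The exclusion rule for snapshots can be sketched in a few lines of shell. This is only an illustration: the function name `is_snapshot_excluded` is hypothetical, and treating each excluded directory as a path prefix is an assumption, since the RFC text only names `/tmp`, `/cnb`, and `/layers`.

```bash
# Hypothetical helper (not part of the lifecycle): succeeds if a changed path
# would be left out of the snapshot layer. Assumes that excluding a directory
# also excludes everything beneath it.
is_snapshot_excluded() {
  case "$1" in
    /tmp|/tmp/*|/cnb|/cnb/*|/layers|/layers/*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Under these assumptions, a file such as `/usr/bin/git` written by a root buildpack would be captured, while anything under `/layers` would not.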


The buildpack can then exclude directories from the layer by writing to the `launch.toml` using a new `[[excludes]]` table, for example:

```toml
[[excludes]]
paths = [ "/var" ]
Contributor

Does this mean that root buildpack authors need to know what paths the packages the buildpack installs write to for layer exclusion?

Member Author

If they want to slim down the run image, then yes they do. Otherwise they can just ignore this file. This is a good thing to point out because this proposal is tilting toward giving the buildpack authors more responsibility. But with that, I think we keep a highly orthogonal interface (i.e. everything is a buildpack).

Contributor

Yeah, just thinking aloud. I think that might work well for cases where a root buildpack is specific to a particular package (say, installing curl), but not so well if they support arbitrary packages based on files in the application repo?

cache = true
```

* `paths` - a list of path globs to exclude from the run image
* `cache` - if set to `true`, the paths will be stored in the cache
Member
@ekcasey ekcasey May 21, 2020

By

the paths will be stored in the cache

do we mean that the snapshotted layer will be stored in the cache?

Member Author

Yes, but as @matthewmcnew pointed out, we could choose to store it as an intermediate build image (no need for snapshot in that case I think).

Member Author

After more thought, I'm unsure if we can get away from the snapshot layer entirely. We may need it to restrict what changes are preserved (for example, we do not want to preserve changes to the /layers or /workspace dirs)
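
To make the `paths` and `cache` semantics discussed in this thread concrete, here is a rough sketch of how a snapshotted path might be routed. Everything in it is an assumption for illustration: the function name `classify_path`, the three labels it prints, and the choice to treat a listed path as covering everything beneath it. It also uses shell `case` matching, which is not identical to the Go pattern syntax the RFC specifies.

```bash
# Hypothetical helper: classify_path PATH CACHE GLOB...
# Prints where a snapshotted path would end up:
#   run-image  the path matches no exclude glob
#   cache      the path is excluded and cache = true
#   excluded   the path is excluded and cache = false
classify_path() {
  path=$1
  cache=$2
  shift 2
  for glob in "$@"; do
    case "$path" in
      # Assumption: a glob covers the path itself and anything below it.
      $glob|$glob/*)
        if [ "$cache" = "true" ]; then echo cache; else echo excluded; fi
        return 0 ;;
    esac
  done
  echo run-image
}
```

With the example `[[excludes]]` entry from the RFC, `classify_path /var/lib/apt/lists true /var` would report that the path goes to the cache rather than the run image.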


The `[process]` and `[[slices]]` tables can be used as normal.
Member

If we allow root buildpacks to use slices, are we assuming that the app dir is excluded from the snapshot layer?

Member Author

yes, the app dir should be excluded from the snapshot layer. I'll call that out.


The following constraint(s) will be enforced:
* If a user attempts to create a buildpackage including both root buildpacks and non-root buildpacks, the process will fail.
Member

What's the rationale behind this constraint? What is the concern it's addressing?

Member Author

This addresses a concern that @sclevine had about root buildpacks proliferating through the ecosystem: if we give this tool to people, they will use it too frequently. If we instead restrict its use, it forces buildpack authors to explore alternative solutions.

I don't necessarily agree with that concern, but I'm willing to accept it in compromise (in part because it's easy to reverse).
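
The buildpackage constraint debated here could be checked along the following lines. This is a sketch under stated assumptions: `check_buildpackage` is a hypothetical name, and it takes the `privileged` value of each buildpack as an argument rather than reading any real `buildpack.toml` files.

```bash
# Hypothetical check: check_buildpackage PRIVILEGED...
# Each argument is the `privileged` value ("true" or "false") of one buildpack
# in the buildpackage. Fails if root and non-root buildpacks are mixed.
check_buildpackage() {
  seen_root=false
  seen_nonroot=false
  for privileged in "$@"; do
    if [ "$privileged" = "true" ]; then
      seen_root=true
    else
      seen_nonroot=true
    fi
  done
  if [ "$seen_root" = "true" ] && [ "$seen_nonroot" = "true" ]; then
    echo "buildpackage mixes root and non-root buildpacks" >&2
    return 1
  fi
}
```

Because the check is a single pass over a flag per buildpack, relaxing it later (as the author suggests is easy to do) would amount to dropping the final condition.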


# How it Works
[how-it-works]: #how-it-works

## Building an App with Additional Packages

Given a root buildpack with ID `example/git`, the following configuration can be used with a Golang app:

`project.toml`:
```toml
[[build.buildpacks]]
Member

Should this be detect.buildpacks?

Member Author

No, that's not a table in the latest definition of project.toml. So I'm proposing that the root buildpacks be listed in the same list as non-root buildpacks.

id = "example/git"
version = "1.0"

[[build.buildpacks]]
id = "example/golang"
version = "1.0"
```

When the following command is run, the app source code will be loaded into the image for access by the root buildpack.

`pack build sclevine/myapp`

All root buildpacks will be sliced out of the list of buildpacks, and run before non-root buildpacks. This creates a version of the current builder with additional layers and an ephemeral run image with additional layers, then does a normal build phase without running the root buildpacks.
Contributor

Could you clarify what the "ephemeral run image" is here? How does it relate to the result of extend phase?

Member
@sclevine sclevine Jun 9, 2020

Confused about this also. Maybe

and an ephemeral run image with additional layers

was supposed to be replaced by the "new extend phase" below?

Member

I found the wording a bit confusing also. Is this what you meant?

Suggested change
All root buildpacks will be sliced out of the list of buildpacks, and run before non-root buildpacks. This creates a version of the current builder with additional layers and an ephemeral run image with additional layers, then does a normal build phase without running the root buildpacks.
All root buildpacks will be sliced out of the list of buildpacks, and run before non-root buildpacks. This creates a version of the current builder with additional layers and an ephemeral run image with additional layers. `Pack` then does a normal build phase, without running the root buildpacks.
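
One way to read "sliced out of the list ... and run before non-root buildpacks" is as a stable partition of the buildpack list. The sketch below illustrates that reading only; `order_buildpacks` and its `<id> <privileged>` input format are hypothetical, and the real lifecycle would read `privileged` from each buildpack's `buildpack.toml`.

```bash
# Hypothetical helper: reads "<id> <privileged>" lines on stdin and prints the
# buildpack IDs with root buildpacks first, then non-root buildpacks. Relative
# order within each group is preserved.
order_buildpacks() {
  root=""
  nonroot=""
  while read -r id privileged; do
    if [ "$privileged" = "true" ]; then
      root="$root $id"
    else
      nonroot="$nonroot $id"
    fi
  done
  # Unquoted on purpose: split the space-separated IDs into one per line.
  printf '%s\n' $root $nonroot
}
```

For the `project.toml` example in this RFC, feeding `example/golang false` and `example/git true` in either order would print `example/git` before `example/golang`.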


After the build phase, *a new extend phase* will run the root buildpacks against the run-image to create an ephemeral base image that will be used by the export phase instead of the configured run-image.

* The run image layers are persisted as part of the app image, and can be reused on subsequent builds.
Member

I am confused about how we generate run image layers using the snapshotting mechanism. Wouldn't we need to run the root buildpacks on the run image itself to capture the correct snapshot?

Member

Also how does a root buildpack opt into reusing a run-image layer? Typically, we use the presence of a layer.toml file combined with the absence of layer content to signal layer reuse, but that doesn't work if the contents are not being written to the layers dir.

Member Author

Yes, the root buildpacks must run twice: once on the build image and once on the run image. I'm not sure about reusing the run-image layer. It might not be possible.

Member

I think we should cache the previous run image layers locally if they can be reused. They need to be built locally anyways.

* The build image layers will be persisted as part of the cache, and can be reused on subsequent builds.
Member

Where would information about these layers (such as their digest) be stored? Is that the responsibility of the platform?

Member Author

I think the lifecycle would manage these layers (the same way as if you are storing the cache in an image). But I do need to address how they are stored when the cache is a volume. I'll fix this up

* Any paths that are excluded using `launch.toml` and set to be cached will be stored in the cache, and can be reused on subsequent builds.
Member

Would the restorer re-apply the snapshotted build layer? Does that make it hard for the buildpack to remove and rebuild that layer if it is out of date?

Member Author

If the buildpack cannot rebuild with the snapshot layer applied, then it is not idempotent and the yet-to-be-named key in the buildpack.toml must specify this.


## Upgrade an App
Contributor

While I understand the allure of a pack upgrade flow, I feel pretty strongly that this should just be accomplished by a complete rebuild. An "upgrade" that re-executes some but not all buildpacks seems arbitrary and confusing.

Member

The pack upgrade flow is just supposed to run the root buildpacks, and only on the run image. This does bring up a serious issue though: the app isn't accessible during a rebase.

@jkutner I think root buildpacks will need something like a /bin/rebase executable and/or some way of storing metadata about the app. Otherwise rebase would require the original source code.

Alternatively, we could only give root buildpacks access to metadata in project.toml and not the app directory.

Member

I agree with @matthewmcnew, if I'm understanding the sentiment correctly.

My comment:

I propose that we eliminate the notion of upgrade and still only have the concept of (re)build and rebase.

If you envision the layers as below and then associate the two operations, it becomes easy to understand and eliminates the complexity being introduced by upgrade.

+-------------------------------+   +-------+
|              app              |           |
+-------------------------------+           |
                                            |
+-------------------------------+           |
|    app deps from buildpacks   |           |  (Re)Build
+-------------------------------+           |
                                            |
+-------------------------------+           |
| app deps from root buildpacks |           |
+-------------------------------+   +-------+

+-------------------------------+   +-------+
|                               |           |
|        OS from stack          |           |  Rebase
|                               |           |
+-------------------------------+   +-------+

As an additional (possible) upside, by not providing additional "upgradability" for this set of layers we steer buildpack authors/developers to push recurring dependencies to the stack layers to improve performance.

Member

+1 to @jromero and @matthewmcnew. I think adding an upgrade would confuse new consumers, who know they need to do something for their app but aren't sure what. If there's an explicit need at a later point, great, but adding it preemptively sounds to me like the confusion risks outweigh the benefits.


`pack upgrade sclevine/myapp`

This will run `pack upgrade sclevine/run`, generate an ephemeral image, and then do the equivalent of `pack rebase sclevine/myapp`.
Contributor

Doesn't executing arbitrary "root buildpacks" before a rebase no longer allow rebase to be a safe operation?

Member Author

I believe it makes it possible that rebase is unsafe. paging @sclevine

Member

Sorry I missed this. This is probably clear by now, but the idea is that we would re-play the root buildpacks on the new run image before rebasing.


Attempting to rebase an app directly after it's been upgraded is not permitted and will fail, because the current base is now ephemeral. In effect, we are saying that the use of root buildpacks breaks rebase.

When the base image has not changed:
1. If the buildpacks are configured as idempotent (the default), load the previous layers onto the base images.
1. Run the buildpacks, creating an ephemeral image.
1. Rebase the app onto the new ephemeral image.

When there is an update to either the build or run base images:
1. Pull the new base image(s).
1. Run the root buildpacks against the new image(s) without loading previous layers, and create an ephemeral image.
1. Rebase the app onto the new ephemeral image.
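
The two upgrade flows above differ only in their first step, which the following sketch tries to make explicit. It is not how `pack upgrade` is implemented: `plan_upgrade`, its arguments, and the step names it prints are all hypothetical, and comparing base image digests is an assumed way of detecting that the base image changed.

```bash
# Hypothetical planner: plan_upgrade PREVIOUS_BASE CURRENT_BASE [IDEMPOTENT]
# Prints the upgrade steps, one per line, for the two cases in the RFC.
plan_upgrade() {
  previous_base=$1
  current_base=$2
  idempotent=${3:-true}   # the RFC describes idempotent as the default
  if [ "$previous_base" != "$current_base" ]; then
    # Base image changed: start from the new image, without previous layers.
    echo "pull-base"
  elif [ "$idempotent" = "true" ]; then
    # Base image unchanged: previous layers may be loaded first.
    echo "restore-layers"
  fi
  echo "run-root-buildpacks"
  echo "rebase-app"
}
```

Calling `plan_upgrade sha256:aaa sha256:bbb` lists a pull followed by a clean run, while passing the same digest twice lists a layer restore before the run.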

## Example: Apt Buildpack

A buildpack that reads an `apt.toml` file to install an arbitrary list of packages would have a `buildpack.toml` like this:

```toml
[buildpack]
id = "example/apt"
privileged = true
```

An end-user's `apt.toml` might look like this:

```toml
[build]
packages = [
"libpq-dev"
]

[run]
packages = [
"libpq",
"ffmpeg"
]
```

The `bin/build` would look like:

```bash
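#!/usr/bin/env bash
set -euo pipefail

# The lifecycle passes the layers directory as the first argument to bin/build.
layers_dir="$1"

apt-get update

# CNB_STACK_TYPE is 'build' or 'run'; install the packages listed for that image.
for package in $(yj -t < apt.toml | jq -r ".${CNB_STACK_TYPE}.packages | .[]?"); do
  apt-get install -y "$package"
done

# Keep /var out of the run image, but store it in the cache for later builds.
cat << EOF > "${layers_dir}/launch.toml"
[[excludes]]
paths = [ "/var" ]
cache = true
EOF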
```

# Alternatives

* https://github.com/buildpacks/rfcs/pull/23

# Drawbacks
[drawbacks]: #drawbacks

- Requires kaniko (or a similar tool) for snapshotting
- Fully rebasing apps with package dependencies becomes many orders of magnitude slower, because the upgrade must re-run the root buildpacks.
- Rebasing apps with package dependencies becomes less reliable, because network failures and other external factors may break the process.
- Some buildpack users will not want to give root access to arbitrary buildpacks. Instead they may want to selectively whitelist certain root buildpacks.

# Questions
[questions]: #questions

* Should we allow non-root buildpacks to run in combination with root buildpacks?
* How do the packages installed by a root buildpack relate (or not relate) to mixins?
  * That is, can you use a root buildpack to install a package that satisfies the requirements of a buildpack's required mixins?
Member

Why wouldn't this be a positive outcome?

Member Author

I think it would be positive, but it's an open question because we aren't defining how it would work.


# References

Incorporated suggested UX tweaks from Javier Romero's doc: https://hackmd.io/zuzsIAh5QGKcQt_EZAcXaw?view

# Spec. Changes
[spec-changes]: #spec-changes

The following spec changes will be made as a result of accepting this proposal.

## `buildpack.toml`

This proposal adds a new key to the `[buildpack]` table in `buildpack.toml`:

```
[buildpack]
privileged = <boolean (default=false)>
? = <boolean (default=false)>
```

* `privileged` - when set to `true`, the lifecycle will run this buildpack as the `root` user
* `?` - when set to `true`, indicates that the buildpack is not idempotent. The lifecycle will provide a clean filesystem from the base image before each run (i.e. no cache). Non-idempotent buildpacks cannot be combined.

## `launch.toml`

This proposal adds a new top-level array of tables in `launch.toml`:

```
[[excludes]]
paths = [ "<path glob>" ]
cache = <boolean (default=false)>
```

Path globs MUST:

* Follow the pattern syntax defined in the Go standard library.
* Match zero or more files or directories.
* Be absolute.

Path globs are not restricted to the app dir as with [slices](https://github.com/buildpacks/spec/blob/master/buildpack.md#launchtoml-toml).
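
As a small illustration of the "absolute" requirement, a lifecycle or buildpack author might reject relative globs up front. The helper name `validate_exclude_glob` is hypothetical, and this sketch deliberately checks only that one rule; it does not validate Go pattern syntax.

```bash
# Hypothetical check: succeeds only if the exclude glob is an absolute path.
# Does not verify that the glob is well-formed Go pattern syntax.
validate_exclude_glob() {
  case "$1" in
    /*) return 0 ;;
    *)
      echo "exclude path must be absolute: $1" >&2
      return 1 ;;
  esac
}
```

Under this check, `/var` and `/usr/share/doc/*` pass, while `var/cache` is rejected.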

## `project.toml`

```
[[stack.build.buildpacks]]
id = "<buildpack ID (optional)>"
version = "<buildpack version (optional default=latest)>"
uri = "<url or path to the buildpack (optional default=urn:buildpack:<id>)>"

[[stack.run.buildpacks]]
id = "<buildpack ID (optional)>"
version = "<buildpack version (optional default=latest)>"
uri = "<url or path to the buildpack (optional default=urn:buildpack:<id>)>"
```

### `[[stack.build.buildpacks]]`

This table MAY include a list of root buildpacks to execute against the builder image.

### `[[stack.run.buildpacks]]`

This table MAY include a list of root buildpacks to execute against the run image.

## Environment

### Provided by the Lifecycle

| Env Variable     | Description      | Detect | Build | Launch |
|------------------|------------------|--------|-------|--------|
| `CNB_STACK_TYPE` | 'build' or 'run' | [x]    | [x]   |        |
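
A root buildpack would typically branch on `CNB_STACK_TYPE` to decide which package list applies. The sketch below shows one defensive way to do that; the helper name `packages_key` is hypothetical, and the `.build.packages` / `.run.packages` layout is taken from the `apt.toml` example in this RFC rather than from any spec.

```bash
# Hypothetical helper: prints the jq path for the current stack type's package
# list, or fails if CNB_STACK_TYPE is missing or has an unexpected value.
packages_key() {
  case "${CNB_STACK_TYPE:-}" in
    build|run)
      echo ".${CNB_STACK_TYPE}.packages" ;;
    *)
      echo "unsupported CNB_STACK_TYPE: '${CNB_STACK_TYPE:-}'" >&2
      return 1 ;;
  esac
}
```

In the apt buildpack example, the result could be passed to `jq`, for instance `yj -t < apt.toml | jq -r "$(packages_key) | .[]"`, so that an unset variable fails loudly instead of producing an invalid filter.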