This RFC proposes a new resource interface to replace the existing resource interface.
As part of this proposal, the interface will now be versioned, starting at 2.0. Today's resource interface (documented here) will be called version 1, even though it was never really versioned.
The introduction of this new interface will be gradual, allowing Concourse users to use a mix of v1 and v2 resources throughout their pipelines. While the new interface is defined in terms of entirely new concepts like spaces, v1 resources will be silently 'adapted' to v2 automatically.
-
Support for multi-branch workflows and build matrixes:
-
Support for creating new branches dynamically (as spaces):
-
Support for creating multiple versions at once:
-
Support for deleting versions:
-
Having resource metadata immediately available via
check
: -
Unifying
source
andparams
as justconfig
so that resources don't have to care where configuration is being set in pipelines: -
Improving stability of reattaching to builds by reading resource responses from files instead of
stdout
: -
Ensuring resource version history is always correct and up-to-date, enabling it to be deduped and removing the need for purging history and removing/renaming resources.
-
Closing gaps in the resource interface that turned them into a "local maxima" and resulted in their being used in somewhat cumbersome ways (notifications, partially-implemented resources, etc.)
- TODO: document 'info'
- TODO: make metadata more flexible?
// Space is a name of a space, e.g. "master", "release/3.14", "1.0".
type Space string
// Config is a black box containing all user-supplied configuration, combining
// `source` in the resource definition with `params` from the step (in the
// case of `get` or `put`).
type Config map[string]interface{}
// Version is a key-value identifier for a version of a resource, e.g.
// `{"ref":"abcdef"}`, `{"version":"1.2.3"}`.
type Version map[string]string
// Metadata is an ordered list of metadata fields to display to the user about
// a resource version. It's ordered so that the resource can decide the best
// way to show it.
type Metadata []MetadataField
// MetadataField is an arbitrary key-value to display to the user about a
// version of a resource.
type MetadataField struct {
Name string `json:"name"`
Value string `json:"value"`
}
The check
command will be invoked with the following JSON structure on
stdin
:
// CheckRequest contains the resource's configuration and latest version
// associated to each space.
type CheckRequest struct {
Config Config `json:"config"`
From map[Space]Version `json:"from"`
ResponsePath string `json:"response_path"`
}
The check
script responds by writing JSON objects ("events") to a file
specified by response_path
. Each JSON object has an action
and a different
set of fields based on the action.
The following event types may be emitted by check
:
-
default_space
: Emitted when the resource has learned of a space which should be considered the "default", e.g. the default branch of agit
repo or the latest version available for a semver'd resource.Required fields for this event:
space
: The name of the space.
-
discovered
: Emitted when a version is discovered for a given space. These must be emitted in chronological order (relative to otherdiscovered
events for the given space - other events may be intermixed).Required fields for this event:
space
: The space the version is in.version
: The version object.metadata
: A list of JSON objects withname
andvalue
, shown to the user.
-
reset
: Emitted when a given space's "current version" is no longer present (e.g. someone rangit push -f
). This has the effect of marking all currently-recorded versions of the space 'deleted', after which the resource will emit any and all versions from the beginning, thus 'un-deleting' anything that's actually still there.Required fields for this event:
space
: The name of the space.
The first request will have an empty object as from
.
Any spaces discovered by the resource but not present in from
should emit
versions from the very first version.
For each space and associated version in from
, the resource should emit all
versions that appear after the given version (not including the given
version).
If a space or given version in from
is no longer present (in the case of git push -f
or branch deletion), the resource should emit a reset
event for the
space. If the space is still there, but the verion was gone, it should follow
the reset
event with all versions detected from the beginning, as if the
from
value was never specified.
The resource should determine a "default space", if any. Having a default space
is useful for things like Git repos which have a default branch, or version
spaces (e.g. 1.8
, 2.0
) which can point to the latest version line by
default. If there is no default space, the user must specify it explicitly in
the pipeline, either by configuring one on the resource (default_space: foo
)
or on every get
step using the resource (spaces: [foo]
).
Given the following request on stdin
:
{
"config": {
"uri": "https://github.com/concourse/concourse"
},
"from": {
"master": {"ref": "abc123"},
"feature/foo": {"ref":"def456"},
"feature/bar": {"ref":"987cia"}
},
"response_path": "/tmp/check-response.json"
}
If the feature/foo
branch has new commits, master
is the default branch and
has no new commits, and feature/bar
has been push -f
ed, you may see
something like the following in /tmp/check-response.json
:
{"action":"discovered","space":"feature/foo","version":{"ref":"abcdf8"},"metadata":[{"name":"message","value":"fix thing"}]}
{"action":"reset","space":"feature/bar"}
{"action":"discovered","space":"feature/bar","version":{"ref":"abcde0"},"metadata":[{"name":"message","value":"initial commit"}]}
{"action":"discovered","space":"feature/bar","version":{"ref":"abcde1"},"metadata":[{"name":"message","value":"add readme"}]}
{"action":"default_space","space":"master"}
{"action":"discovered","space":"feature/foo","version":{"ref":"abcdf9"},"metadata":[{"name":"message","value":"fix thing even more"}]}
{"action":"discovered","space":"feature/bar","version":{"ref":"abcde2"},"metadata":[{"name":"message","value":"finish the feature"}]}
A few things to note:
-
A
reset
event is emitted immediately upon detecting that the given version forfeature/bar
(987cia
) is no longer available, followed by adiscovered
event for every commit going back to the initial commit on the branch. -
No versions are emitted for
master
, because it's already up to date (abc123
is the latest commit). -
The versions detected for
feature/foo
may appear between events forfeature/bar
, as they're for unrelated spaces. The order only matters within the space.
The get
command will be invoked with the following JSON structure on stdin
:
type GetRequest struct {
Config Config `json:"config"`
Space Space `json:"space"`
Version Version `json:"version"`
}
The command will be invoked with a completely empty working directory. The
command should populate this directory with the requested bits. The git
resource, for example, would clone directly into the working directory.
If the requested version is unavailable, the command should exit nonzero.
No response is expected.
Anything printed to stdout
and stderr
will propagate to the build logs.
The put
command will be invoked with the following JSON structure on stdin
:
type PutRequest struct {
Config Config `json:"config"`
ResponsePath string `json:"response_path"`
}
The command will be invoked with all of the build plan's artifacts present in
the working directory, each as ./(artifact name)
.
The put
script responds by writing JSON objects ("events") to a file
specified by response_path
, just like check
. Each JSON object has an
action
and a different set of fields based on the action.
Anything printed to stdout
and stderr
will propagate to the build logs.
The following event types may be emitted by put
:
-
created
: Emitted when the resource has created (perhaps idempotently) a version. The version will be recorded as an output of the build.Versions produced by
put
will not be directly inserted into the resource's version history in the pipeline, as they were with v1 resources. This enables one-off versions to be created and fetched within a build without disrupting the normal detection of resource versions across theRequired fields for this event:
space
: The space the version is in.version
: The version object.metadata
: A list of JSON objects withname
andvalue
, shown to the user. Note that this is return by bothput
andcheck
, because there's a chance thatput
produces a version that wouldn't normally be discovered bycheck
.
-
deleted
: Emitted when a version has been deleted. The version record will remain in the database for archival purposes, but it will no longer be a candidate for any builds.Required fields for this event:
space
: The space the version is in.version
: The version object.
Because the space is included on each event, put
allows new spaces to be
generated dynamically based on params and/or the bits in its working directory
and propagated to the rest of the pipeline.
I've started cooking up new resources using this interface. I've left TODO
s
for parts that need more thinking or discussion. Please leave comments!
This resource models the original git
resource. It represents each branch as a space.
This is a whole new semver resource intended to replace the original semver
resource with a better model that supports concurrent version lines (i.e.
supporting multiple major/minor releases with patches). It does this by managing
tags in an existing Git repository.
This resource models the original s3
resource. Only regex versions were
implemented, each space corresponds to a major.minor version. For example, 1.2.0
and 1.2.1 is the same space but 1.3.0 is a different space. Single numbers are
also supported with default minor of 0. The default space is set to the latest
minor version.
TODO:
- Pull Requests
- Feature branches
- Build matrixes
- Generating branches (and propagating them downstream)
- Semver artifacts
- Fanning out against multiple IaaSes
- Pool resource?
- BOSH deploys
-
Add an
info
script which returns a JSON object indicating the supported interfaces, their protocol versions, and any other interface-specific meta-configuration (for example, which commands to execute for the interface's hooks). -
The first supported interface will be called
artifacts
, and its version will start at2.0
as it's really the next iteration of the existing "resources" concept, but with a more specific name. -
There are no more hardcoded paths (
/opt/resource/X
) - instead there's the singleinfo
entrypoint, which is run in the container's working directory. This is more platform-agnostic.
-
Remove the distinction between
source
andparams
; resources will receive a singleconfig
. The distinction will remain in the pipeline. This makes it easier to implement a resource without planning ahead for dynamic vs. static usage patterns. This will become more powerful if concourse/concourse#684 is implemented. -
Change
check
to run against all spaces. It will be given a mapping of each space to its current latest version, and return the set of all spaces, along with any new versions in each space.This is all done as one batch call so that resources can decide how to efficiently perform the check. It also keeps the container overhead down to one per resource, rather than one per space.
-
Remove the implicit
get
after everyput
, now requiring the pipeline to explicitly configure aget
field on the same step. This is necessary now thatput
can potentially perform an operation resulting solely indeleted
events, in which case there is nothing to fetch.This has also been requested by users for quite a while, for the sake of optimizing jobs that have no need for the implicit
get
. -
Change
put
to emit a sequence of created versions, rather than just one.Technically the
git
resource may push many commits, so returning more than one version is necessary to track them all as outputs of a build. This could also support batch creation.To ensure
check
is the source of truth for ordering, the versions emitted byput
are not directly inserted into the database. Instead, they are simply recorded as outputs of the build. The order does matter, however - if a user configures aget
on theput
step, the last version emitted will be fetched. For this reason they should be emitted in chronological order. -
Change
put
to additionally return a sequence of deleted versions.There has long been a call for a batch
delete
ordestroy
action. Adding this toput
alongside the set of created versions allowsput
to become a general idempotent side-effect performer, rather than implying that each resource must support a separatedelete
action. -
Change
get
to always run against a particular space, given by the request payload. -
Change
check
to include metadata for each version. Changeget
to no longer return it.This way metadata is always immediately available, which could enable us to have a richer UI for the version history page.
The original thought was that metadata collection may be expensive, but so far we haven't seen that to be the case.
-
Change
get
script to no longer return a version, since it's always given one now. As a result,get
no longer has a response; it just succeeds or fails. -
Change
get
andput
to run with the bits as their working directory, rather than taking the path as an argument. This was something people would trip up on when implementing a resource. -
Change
check
andput
to write its JSON response to a specified file, rather thanstdout
, so that we don't have to be attached to process its response.This is one of the few ways a build can error after the ATC reattaches (
unexpected end of JSON
). With it written to a file, we can just try to read the file when we re-attach after seeing that the process exited. This also frees upstdout
/stderr
for normal logging, which has been an occasional pitfall during resource development/debugging.Another motivation for this is safety: with
check
emitting a ton of data, there is danger in Garden losing chunks of the output due to a slow consumer. Writing to a file circumvents this issue.
With v1 resources, every put
in a Concourse pipeline implied a get
of the
version that was created. With v2, the get
will be made opt-in. This has been
a long-time ask, and one objective reason to make it opt-in is that Concourse
can't know ahead of time that there will even be anything to get
- for
example, the put
could emit only deleted
events.
So, to get
the latest version that was produced by the put
, you would
configure something like:
- put: my-resource
get: my-created-resource
- task: use-my-created-resource
The value for the get
field is the name under which the artifact will be
saved (just like get
steps). When specified, the last version emitted will be
fetched (from whichever space it was in).
Resources that really only have a "current state", such as deployments, can now represent their state more clearly because old versions that are no longer there will be marked 'deleted'.
This can be done by representing each non-linear version in a separate space. For example, generated code could be pushed to a generated (but deterministic) branch name, and that space could then be passed along.
Now that put
doesn't directly modify the resource's version history, it can
be used to provide explicitly versioned 'variants' of original versions without
doubling up the version history. One use case for this is pull-requests: you
may want a build to pull in one resource for the PR itself, another resource
for the base branch of the upstream reap, and then put
to produce a
"combined" version of the two, representing the PR merged into the upstream
repo:
jobs:
- name: run-pr
plan:
- get: concourse-pr # pr: 123, ref: deadbeef
trigger: true
- get: concourse # ref: abcdef
- put: concourse-pr
get: merged-pr
params:
merge_base: concourse
status: pending
# the `put` will learns base ref from `concourse` input and param, and emit
# a 'created' event with the following version:
#
# pr: 123, ref: deadbeef, base: abcdef
#
# the `get` will then run with that version and knows to merge onto the
# given base ref
- task: unit
# uses 'merged-pr' as an input
Initially there was a limitation that put
could only emit versions pertaining
to a single space. This was to prevent ambiguity with "get
after put
" -
which space would the get
fetch from? We loosened this constraint because it
felt somewhat arbitrary, as the protocol allows it easily, and recording
outputs and marking versions as deleted across spaces isn't any harder than
with a single space.
To loosen the constraint we've instead constrained the get
to only fetch the
last version, from whichever space it was in. But are there any good examples
of this being useful, or have we just moved the arbitrary restriction
elsewhere? (At least we've moved it out of the resource interface - technically
this is a pipeline concern, not a resource interface concern.)
Would users want to fetch multiple spaces that were created? Would they want to
do this statically (at pipeline definition time) or dynamically (at runtime)?
Static would be relatively easily as the build plan would just result in
multiple get
steps, but dynamic would run into the same challenges as with
dynamic build plan
generation. However users
could always just separate it into a different job spanning the spaces
dynamically with wildcards.
Here's a mockup for static configuration:
- put: foo
params: bar
get: {artifact-name-a: space-a, artifact-name-b: space-b}
...but is that useful?
This is really in need of a use case to define it further, but for now the constraint has been lifted from the resource interface, and it's up to the rest of Concourse's pipeline mechanics to determine what's possible from there.
Can we reduce the `check` overhead?
With spaces there will be more `check`s than ever. Right now, there's one
container per recurring `check`. Can we reduce the container overhead here by
requiring that resource `check`s be side-effect free and able to run in
parallel?
There may be substantial security implications for this.
This is now done as one big `check` across all spaces, run in a single container. Resources can choose how to perform this efficiently and safely. This may mean GraphQL requests or just iterating over local shared state in series. Even in the worst-case, where no parallelism is involved, it will at least consume only one container.
Is `destroy` general enough to be a part of the interface?
It may be the case that most resources cannot easily support `destroy`. One
example is the `git` resource. It doesn't really make sense to `destroy` a
commit. Even if it did (`push -f`?), it's a kind of weird workflow to support
out of the box.
Could we instead just have `put` and ensure that we `check` in such a way that
deleted versions are automatically noticed? What would the overhead of this
be? This only works if the versions are "chained", as with the `git` case.
Decided against introducing `destroy` in favor of having `put` return two sets for each space: versions created and versions deleted. This generalizes `put` into an idempotent versioned artifact side effect performer.
Should `put` be given a space or return the space?
The verb `PUT` in HTTP implies an idempotent action against a given resource. So
it's intuitive that the `put` verb here would do the same.
However, many of today's usage of `put` would be against a dynamically
determined space. For example, most semver workflows involve `put`ing with the
version determined by a file (often coming from the `semver` resource). So the
space isn't known statically at pipeline configuration time.
What's more, the resulting space for a semver push would only be `MAJOR.MINOR`,
excluding the final patch segment. This is annoying to have to explicitly
configure in your build.
If we instead have `put` return both the space and the versions, this would be a
lot simpler.
Answered this at the same time as having `put` return a set of deleted versions. It'll return multiple spaces and versions created/deleted for them.
Now that we're going to be collecting all versions of every resource, we should be careful not to be scanning the entire table all the time, and even make an effort to share data when possible. We have implemented this with concourse/concourse#2386.