Add provenance-related data field in Run.Status #5550

Closed
chuangw6 opened this issue Sep 23, 2022 · 4 comments · Fixed by #5580
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@chuangw6
Member

chuangw6 commented Sep 23, 2022

Feature request

Related to #5529
Add provenance-related data to the TaskRun/PipelineRun status to record authenticated metadata about how a software artifact was built, i.e. the sources that remote resources came from.

TaskRunStatusFields will be

type TaskRunStatusFields struct {
  ...
  ProvenanceData *ProvenanceData
  ...
}

PipelineRunStatusFields will be

type PipelineRunStatusFields struct {
  ...
  ProvenanceData *ProvenanceData
  ...
}

with ProvenanceData struct:

type ProvenanceData struct {
  ConfigSource *intoto.ConfigSource
  // In the future, more provenance-related fields that can be piped from the
  // pipeline side can be added here as needed.
}
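
To make the proposed shape concrete, here is a minimal, self-contained sketch of how the nested field might look once populated and serialized. It is only an illustration under assumptions: `ConfigSource` below is a local stand-in that mirrors the three fields of the SLSA v0.2 ConfigSource (uri, digest, entryPoint) so the example runs without external imports, the JSON tag names are guesses, and the repository URL, commit SHA, and path are made-up values.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ConfigSource is a local stand-in for intoto.ConfigSource (assumption:
// it mirrors the SLSA v0.2 fields uri, digest, and entryPoint).
type ConfigSource struct {
	URI        string            `json:"uri,omitempty"`
	Digest     map[string]string `json:"digest,omitempty"`
	EntryPoint string            `json:"entryPoint,omitempty"`
}

// ProvenanceData wraps all provenance-related data, as proposed in the issue.
type ProvenanceData struct {
	ConfigSource *ConfigSource `json:"configSource,omitempty"`
}

// TaskRunStatusFields shows only the new field; the JSON tag is an assumption.
type TaskRunStatusFields struct {
	ProvenanceData *ProvenanceData `json:"provenanceData,omitempty"`
}

// exampleStatus builds a status with hypothetical values for illustration.
func exampleStatus() TaskRunStatusFields {
	return TaskRunStatusFields{
		ProvenanceData: &ProvenanceData{
			ConfigSource: &ConfigSource{
				URI:        "https://github.com/example/catalog.git", // hypothetical repo
				Digest:     map[string]string{"sha1": "0123456789abcdef0123456789abcdef01234567"},
				EntryPoint: "task/build/build.yaml", // hypothetical path
			},
		},
	}
}

func main() {
	out, err := json.MarshalIndent(exampleStatus(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because the wrapper is a pointer with `omitempty`, a Run that did not use a remote resolver would simply omit the field in this sketch.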

Use case

There is now a clear requirement that the provenance record where remote resources came from, so that the config file can be linked back to its origin. When users give the git resolver only a branch or tag name, the commit SHA that the resolver used at the moment it resolved the remote resource is the important information to record in the provenance. The URL and the entrypoint (a path to a configuration file) are source information that should be recorded as well.

  • To record this data in a structured way, we use the SLSA standard ConfigSource struct. The data is piped from the remote resolver's ResolutionRequest status, as proposed in Add provenance-related field in ResolutionRequest.Status #5529.
  • To keep the flexibility to add more provenance-related data to Run.Status, we aim to create a ProvenanceData type designed to wrap all of the data needed, including the ConfigSource.
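
The piping step described in the bullets above could look roughly like the sketch below. Everything here is hypothetical and simplified: `ResolutionRequestStatus`, `TaskRunStatus`, the `Source` field name, and `recordConfigSource` are stand-ins invented for illustration, not the actual Tekton types or reconciler code, and `ConfigSource` is again a local copy of the SLSA shape.

```go
package main

import "fmt"

// ConfigSource is a local stand-in for the SLSA ConfigSource struct.
type ConfigSource struct {
	URI        string
	Digest     map[string]string
	EntryPoint string
}

// ResolutionRequestStatus is a hypothetical, trimmed-down resolver status
// that carries the source information of the resolved resource.
type ResolutionRequestStatus struct {
	Source *ConfigSource
}

// ProvenanceData wraps provenance-related data on the Run side.
type ProvenanceData struct {
	ConfigSource *ConfigSource
}

// TaskRunStatus is a hypothetical, trimmed-down Run status.
type TaskRunStatus struct {
	ProvenanceData *ProvenanceData
}

// recordConfigSource copies the resolver's source info into the Run status.
// It makes a deep copy so later changes to the request do not leak into the Run.
func recordConfigSource(status *TaskRunStatus, resolved ResolutionRequestStatus) {
	if resolved.Source == nil {
		return // nothing was resolved remotely; leave the status untouched
	}
	src := *resolved.Source
	src.Digest = make(map[string]string, len(resolved.Source.Digest))
	for alg, sum := range resolved.Source.Digest {
		src.Digest[alg] = sum
	}
	status.ProvenanceData = &ProvenanceData{ConfigSource: &src}
}

func main() {
	// Hypothetical values: the user asked for branch "main", and the resolver
	// reports the commit it actually resolved.
	resolved := ResolutionRequestStatus{Source: &ConfigSource{
		URI:        "https://github.com/example/catalog.git",
		Digest:     map[string]string{"sha1": "0123456789abcdef0123456789abcdef01234567"},
		EntryPoint: "task/build/build.yaml",
	}}
	var status TaskRunStatus
	recordConfigSource(&status, resolved)
	fmt.Printf("%+v\n", *status.ProvenanceData.ConfigSource)
}
```

A consumer such as Chains would then read the typed field from the Run status rather than looking anything up by key.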

Without a structured type in Run.Status and ResolutionRequest.Status, the only way to achieve this is to pass the data through annotations, which has several drawbacks:

  • the data is unstructured
  • it is hard to maintain and to change in the future if more provenance-related data needs to be piped to the Run object so that Chains can pick it up and record it in the provenance
  • implementation challenges: passing annotation maps around is less elegant than using a structured, standardized data type.
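
As a rough illustration of the drawbacks listed above, the sketch below shows what a consumer would have to do if the same data arrived as annotations. The annotation keys and the helper `configSourceFromAnnotations` are invented for this example and are not real Tekton or Chains identifiers; `ConfigSource` is a local stand-in for the SLSA struct.

```go
package main

import "fmt"

// Hypothetical annotation keys; both producer and consumer would have to
// agree on these strings by convention.
const (
	annotationURI        = "example.dev/config-source-uri"
	annotationCommit     = "example.dev/config-source-commit"
	annotationEntryPoint = "example.dev/config-source-entrypoint"
)

// ConfigSource is a local stand-in for the SLSA ConfigSource struct.
type ConfigSource struct {
	URI        string
	Digest     map[string]string
	EntryPoint string
}

// configSourceFromAnnotations rebuilds the structured value from loose
// string pairs. Every key must be looked up and validated by hand.
func configSourceFromAnnotations(annotations map[string]string) (*ConfigSource, error) {
	get := func(key string) (string, error) {
		v, ok := annotations[key]
		if !ok || v == "" {
			return "", fmt.Errorf("missing annotation %q", key)
		}
		return v, nil
	}
	uri, err := get(annotationURI)
	if err != nil {
		return nil, err
	}
	commit, err := get(annotationCommit)
	if err != nil {
		return nil, err
	}
	entryPoint, err := get(annotationEntryPoint)
	if err != nil {
		return nil, err
	}
	return &ConfigSource{
		URI:        uri,
		Digest:     map[string]string{"sha1": commit},
		EntryPoint: entryPoint,
	}, nil
}

func main() {
	// A producer that misspells one key compiles fine and only fails at runtime.
	annotations := map[string]string{
		annotationURI:                   "https://github.com/example/catalog.git",
		"example.dev/config-source-sha": "0123456789abcdef0123456789abcdef01234567", // wrong key
		annotationEntryPoint:            "task/build/build.yaml",
	}
	if _, err := configSourceFromAnnotations(annotations); err != nil {
		fmt.Println("annotation approach failed:", err)
	}
}
```

With a typed field, the same mistake would be a compile error, and adding a new provenance field would not require a new key convention.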

Data flow

[Image: Screen Shot 2022-09-23 at 14 56 28 (data flow diagram)]

@chuangw6 chuangw6 added the kind/feature Categorizes issue or PR as related to a new feature. label Sep 23, 2022
@chuangw6
Member Author

Inspired by the PR comments and by conversations with @wlynch and @dibyom. Thanks!

cc @abayer @vdemeester @jerop @lbernick.

Please comment in this issue if anyone has any thoughts/concerns/comments on this. Thank you!!!

@wlynch
Member

wlynch commented Sep 26, 2022

lgtm! You should sync with @ywluogg - there was a desire to record other types of provenance data for inputs/outputs of a Run that we probably want to sort out (maybe it also goes in this Provenance struct?)

Otherwise, slight bikeshed - ProvenanceData -> Provenance? I don't think the Data suffix adds much.

@ywluogg
Contributor

ywluogg commented Sep 29, 2022

The ConfigSource records the build provenance data. There is another set of provenance we may be interested in: to record the provenance of the artifacts, we are looking for the subjects and materials in the Statement.

This provenance doesn't come from the resolver; it comes from the artifacts' SHAs and URIs. I wonder whether this provenance field is also meant to include that provenance?

@chuangw6
Member Author

Synced with @ywluogg @wlynch @jagathprakash in the Chains WG meeting on Sep 29, 2022. We landed on the idea of starting with just the ConfigSource field in the Provenance for now, but we're on board with adding the inputs (a.k.a. materials) and outputs (a.k.a. subjects) mentioned in TEP-0109 when we're ready.

Thank you all for sharing your thoughts in the meeting.

chuangw6 added a commit to chuangw6/pipeline that referenced this issue Oct 3, 2022
Change 1: Add a Provenance field in TaskRun&PipelineRun status that
currently only contains configsource data, but can be extended later to
have more provenance-related fields.

Change 2: Previously, tektoncd#5551 introduced
the ConfigSource to the api/resolution alpha & beta packages. In this PR, we move
the ConfigSource to the api/pipeline alpha & beta packages so that the provenance
field can reuse that type (it cannot import api/resolution alpha because of an
import cycle).

Why: See the motivation and discussions in tektoncd#5550.
The tl;dr is that it helps pass provenance-related data in a more structured way;
ConfigSource is one example.

Signed-off-by: Chuang Wang <chuangw@google.com>
chuangw6 added a commit to chuangw6/pipeline that referenced this issue Oct 3, 2022
Change 1: Add a Provenance field in TaskRun&PipelineRun status. This field
currently only contains a subfield named `ConfigSource`, but can be extended later to
have more provenance-related fields.

Change 2: Previously, tektoncd#5551 introduced
the ConfigSource to the api/resolution alpha & beta packages. In this PR, we move
the ConfigSource to the api/pipeline alpha & beta packages so that the provenance
field can reuse that type (it cannot import api/resolution alpha because of an
import cycle).

Why: See the motivation and discussions in tektoncd#5550.
The tl;dr is that it helps pass provenance-related data in a more structured way;
ConfigSource is one example.

Signed-off-by: Chuang Wang <chuangw@google.com>
tekton-robot pushed a commit that referenced this issue Oct 18, 2022
Change 1: Add a Provenance field in TaskRun&PipelineRun status. This field
currently only contains a subfield named `ConfigSource`, but can be extended later to
have more provenance-related fields.

Change 2: Previously, #5551 introduced
the ConfigSource to the api/resolution alpha & beta packages. In this PR, we move
the ConfigSource to the api/pipeline alpha & beta packages so that the provenance
field can reuse that type (it cannot import api/resolution alpha because of an
import cycle).

Why: See the motivation and discussions in #5550.
The tl;dr is that it helps pass provenance-related data in a more structured way;
ConfigSource is one example.

Signed-off-by: Chuang Wang <chuangw@google.com>
Repository owner moved this from Todo to Done in Tekton Community Roadmap Oct 18, 2022