Add provenance-related field in ResolutionRequest.Status #5529

Closed
chuangw6 opened this issue Sep 20, 2022 · 4 comments · Fixed by #5551
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@chuangw6
Member

chuangw6 commented Sep 20, 2022

Feature request

For remote resolution, ResolutionRequestStatusFields currently has only a Data field, which stores the string representation of the resolved content. It would be great to add a structured SourceRef field to ResolutionRequestStatus, typed as the ConfigSource that SLSA defines.

current

type ResolutionRequestStatusFields struct {
	// Data is a string representation of the resolved content
	// of the requested resource in-lined into the ResolutionRequest
	// object.
	Data string `json:"data"`
}

desired

type ResolutionRequestStatusFields struct {
	// Data is a string representation of the resolved content
	// of the requested resource in-lined into the ResolutionRequest
	// object.
	Data string `json:"data"`
	// SourceRef is the structured source reference of the resolved
	// content, using the SLSA ConfigSource type.
	SourceRef intoto.ConfigSource
}

Use case

As discussed in #5522, we need to pass the source ref information to the Run status so that Chains can pick it up and record the link back to the origin in the SLSA provenance.

Currently, we are trying to do this through annotations, e.g. for the git resolver in #5397. The problem is that annotations are unstructured. This may cause confusion about what those annotations mean and make it hard to see how they interact with existing annotations on the Run object.
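
As a rough illustration of the structured alternative, the sketch below shows what the serialized status could look like. The local ConfigSource type only mirrors the uri/digest/entryPoint shape of the SLSA v0.2 ConfigSource so the example stays self-contained, and the sourceRef JSON tag, the example URL, and the encodeStatus helper are assumptions, not something this issue specifies.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ConfigSource is a local stand-in mirroring the shape of the SLSA v0.2
// ConfigSource (uri, digest, entryPoint). The real type would come from
// the in-toto library; it is redefined here only for self-containment.
type ConfigSource struct {
	URI        string            `json:"uri,omitempty"`
	Digest     map[string]string `json:"digest,omitempty"`
	EntryPoint string            `json:"entryPoint,omitempty"`
}

// ResolutionRequestStatusFields follows the "desired" struct above.
// The json tag on SourceRef is an assumption made for this sketch.
type ResolutionRequestStatusFields struct {
	Data      string       `json:"data"`
	SourceRef ConfigSource `json:"sourceRef"`
}

// encodeStatus (hypothetical helper) renders the status fields as JSON.
func encodeStatus(s ResolutionRequestStatusFields) (string, error) {
	b, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	status := ResolutionRequestStatusFields{
		Data: "resolved-content",
		SourceRef: ConfigSource{
			// Placeholder values, not taken from a real repository.
			URI:        "git+https://example.com/org/repo.git",
			Digest:     map[string]string{"sha1": "0123456789abcdef0123456789abcdef01234567"},
			EntryPoint: "task/build.yaml",
		},
	}
	out, err := encodeStatus(status)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Unlike a set of free-form annotations, a consumer such as Chains would only need to know these three keys, whichever resolver produced them.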

@chuangw6 chuangw6 added the kind/feature Categorizes issue or PR as related to a new feature. label Sep 20, 2022
@chuangw6
Member Author

chuangw6 commented Sep 20, 2022

Had this discussion with @wlynch in today's S3C meeting.

@abayer Please comment here if you have any questions or concerns. Happy to take this on if we agree on the approach. Thanks!

@abayer
Contributor

abayer commented Sep 20, 2022

So there's the additional problem that we can't guarantee structured resolution source information - that's going to depend on the resolver implementation itself. We can control that for git, bundles, cluster, and hub, but not for any third-party resolvers that are written. I'm also a bit wary of using a struct because even just with those four resolvers, we've got 5+ from git (I forget off the top of my head if more are added in #5397), 3 in bundles, and 2 in cluster (hub currently doesn't have any annotations), so that would be a pretty dang big struct. It really feels like we're just going to end up approximating a map anyway.

@wlynch
Member

wlynch commented Sep 20, 2022

It just needs to be structured enough that we can report back in a general way to record what was fetched in build provenance. https://github.com/in-toto/attestation#provenance-example has an example of what this should look like.

A pURL + digest identifier would do the trick. As @chuangw6 mentioned, Intoto ConfigSource is what we're ultimately looking to populate - the schema is pretty flexible so resolvers can self-determine what the format of the pURL is + what revision types they want to support.
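
For instance, a git resolver might fill in the URI and digest roughly as sketched below. The "git+<url>@<revision>" form follows the style of the in-toto provenance example linked above; the gitConfigSource helper, its parameters, and the example values are assumptions, and ConfigSource is a local stand-in for the SLSA type.

```go
package main

import (
	"errors"
	"fmt"
)

// ConfigSource is a local stand-in mirroring the SLSA v0.2 ConfigSource shape.
type ConfigSource struct {
	URI        string
	Digest     map[string]string
	EntryPoint string
}

// gitConfigSource (hypothetical helper) builds a source reference for content
// fetched from git. revision is what the user asked for (branch or tag) and
// commitSHA is what it resolved to, so the record pins the exact commit.
func gitConfigSource(repoURL, revision, commitSHA, pathInRepo string) (ConfigSource, error) {
	if repoURL == "" || commitSHA == "" {
		return ConfigSource{}, errors.New("repo URL and resolved commit SHA are required")
	}
	uri := "git+" + repoURL
	if revision != "" {
		uri += "@" + revision
	}
	return ConfigSource{
		URI:        uri,
		Digest:     map[string]string{"sha1": commitSHA},
		EntryPoint: pathInRepo,
	}, nil
}

func main() {
	// Placeholder values, not taken from a real repository.
	cs, err := gitConfigSource(
		"https://example.com/org/repo.git",
		"main",
		"0123456789abcdef0123456789abcdef01234567",
		"task/build.yaml",
	)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", cs)
}
```

Other resolvers could choose a different URI scheme and digest algorithm while keeping the same three fields.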

@chuangw6
Member Author

chuangw6 commented Sep 21, 2022

> I'm also a bit wary of using a struct because even just with those four resolvers, we've got 5+ from git (I forget off the top of my head if more are added in #5397), 3 in bundles, and 2 in cluster (hub currently doesn't have any annotations), so that would be a pretty dang big struct. It really feels like we're just going to end up approximating a map anyway.

If we start using the structured in-toto ConfigSource, I think we wouldn't need to create those annotations at all. Passing the structured SourceRef from the ResolutionRequest to the pipeline reconciler should be sufficient, and that is essentially what Chains needs for the provenance.

type ResolutionRequestStatusFields struct {
	// Data is a string representation of the resolved content
	// of the requested resource in-lined into the ResolutionRequest
	// object.
	Data string `json:"data"`
	// SourceRef is the structured source reference of the resolved
	// content, using the SLSA ConfigSource type.
	SourceRef intoto.ConfigSource
}

chuangw6 added a commit to chuangw6/pipeline that referenced this issue Sep 23, 2022
Related to
- tektoncd#5529
- tektoncd#5397

Before:
The customized status of ResolutionRequest only contains the
resolved data.

Now:
The resolved source reference of the remote data is also added
to the ResolutionRequest.status. It is recorded in a structured way
using the standard SLSA ConfigSource struct.

Why?
There is now a clear requirement that the source information of the
remote data be recorded in the provenance so that it links back to its
origin, including the resolved commit SHA when users provide only a
branch/tag name to the resolver. Without this PR, the only way to achieve
this is to pass the resolved source information through annotations, which
has a couple of drawbacks, e.g. the data is unstructured and hard to
maintain or change in the future. This PR addresses that problem.

Signed-off-by: Chuang Wang <chuangw@google.com>
@chuangw6 chuangw6 changed the title ResolutionRequestStatus passes back source ref data in structured way Add provenance-related field in ResolutionRequest.Status Sep 23, 2022
chuangw6 added a commit to chuangw6/pipeline that referenced this issue Sep 26, 2022
chuangw6 added a commit to chuangw6/pipeline that referenced this issue Sep 27, 2022
Related to
- tektoncd#5529
- tektoncd#5397

Before:
The `ResolutionRequestStatusFields` only has the `Data` that is a string
representation of the resolved content.

Now:
A new field called `Source` is now introduced to the `ResolutionRequestStatusFields`
to record the source information of the remote data in a structured way
using the standard SLSA ConfigSource struct.

Why?
There is now a clear requirement that the source information of the
remote data be recorded in the provenance so that it links back to its
origin, including the resolved commit SHA when users provide only a
branch/tag name to the resolver. Without this PR, the only way to achieve
this is to pass the resolved source information through annotations, which
has a couple of drawbacks, e.g. the data is unstructured and hard to
maintain or change in the future. This PR addresses that problem.

Signed-off-by: Chuang Wang <chuangw@google.com>
chuangw6 added a commit to chuangw6/pipeline that referenced this issue Sep 28, 2022
chuangw6 added a commit to chuangw6/pipeline that referenced this issue Sep 29, 2022
tekton-robot pushed a commit that referenced this issue Sep 30, 2022
Related to
- #5529
- #5397

Before:
The `ResolutionRequestStatusFields` only has the `Data` that is a string
representation of the resolved content.

Now:
A new field called `Source` is now introduced to the `ResolutionRequestStatusFields`
to record the source information of the remote data in a structured way
using the standard SLSA ConfigSource struct.

Why?
There is now a clear requirement that the source information of the
remote data be recorded in the provenance so that it links back to its
origin, including the resolved commit SHA when users provide only a
branch/tag name to the resolver. Without this PR, the only way to achieve
this is to pass the resolved source information through annotations, which
has a couple of drawbacks, e.g. the data is unstructured and hard to
maintain or change in the future. This PR addresses that problem.

Signed-off-by: Chuang Wang <chuangw@google.com>
Repository owner moved this from Todo to Done in Tekton Community Roadmap Oct 18, 2022