| status | title | creation-date | last-updated | authors |
|---|---|---|---|---|
| implemented | Allow custom task to be embedded in pipeline | 2021-03-18 | 2021-05-26 | |
- Summary
- Motivation
- Requirements
- Proposal
- Design Details
- Test Plan
- Design Evaluation
- Drawbacks
- Alternatives
- Infrastructure Needed (optional)
- Upgrade & Migration Strategy (optional)
- Implementation Pull request(s)
- References (optional)
Tektoncd/Pipeline currently allows a custom task to be referenced in a pipeline
resource specification using `taskRef`.
This TEP discusses the various aspects of embedding a custom task in the `TaskSpec`
of the Tekton `Pipeline` CRD and in the `RunSpec` of the Tekton `Run` CRD. Just as a
regular task can be either referenced or embedded in a `pipelineRun`, after
implementation of this TEP similar support will be available for custom task controllers as well.
Today, a custom task that a pipeline references must be submitted to Kubernetes
alongside the submission of the Tektoncd/pipeline itself: to run the pipeline,
creation of the custom task resource object is submitted as a separate request to Kubernetes.
If multiple custom task resource objects are created with the same name,
both Kubernetes and Tektoncd/Pipeline treat them as the same task. This behavior
can have unintended consequences when Tektoncd/Pipeline is used as a backend by multiple users.
The problem becomes even greater when new users follow documents such as
Get started, where each user may end up with the same names for tasks and pipelines.
In such an environment, multiple users will step on each other's toes and produce unintended results.
Another motivation for this TEP is reducing the number of API calls needed to fetch all the pipeline information.
A case in point: in Kubeflow Pipelines (KFP), we need all the templates and task specs to live in each pipeline. Currently,
having all the custom task templates live at the Kubernetes namespace scope means that
we have to make multiple API calls to Kubernetes in order to get all the pipeline
information to render in our API/UI. For example, when we create a `pipelineRun` with custom
tasks, the KFP client first needs to make multiple API calls to Kubernetes to create all the
custom task CRDs in the same namespace before creating the `pipelineRun`. Having the entire spec
inside a single `pipelineRun` simplifies task/pipeline submission for the KFP client and reduces the
number of API calls to the Kubernetes cluster.
Currently TektonCD/Pipeline supports embedding a task specification in
a pipeline for regular tasks, but not for custom tasks. If Tektoncd/Pipeline
also allows a custom task specification to be embedded in a pipeline specification,
the behavior will be unified with regular tasks while retaining the existing `taskRef` behavior.
Most importantly, embedding the spec avoids naming conflicts when multiple users in the
same namespace create resources. Related issue:
tektoncd/pipeline#3682
- Allow custom tasks to be embedded in a pipeline specification.
- A custom `taskSpec` should be submitted as part of the `runSpec`.
- Document general advice on validation/verification of custom tasks for custom task controller developers.
- Custom task controllers are to be developed by other parties; validation of the custom task specification by Tektoncd/Pipeline webhooks is limited to the structure and fields it knows about.
Use cases from Kubeflow Pipelines (KFP), where tektoncd is used as a backend for running pipelines:

- The KFP compiler can put all the information in one `pipelineRun` object, so the KFP client doesn't need to create any Kubernetes resources before running the `pipelineRun`.
- KFP doesn't have to manage the lifecycle of the associated custom task resource objects for each pipeline. Since many custom task resource objects are namespace-scoped, multiple users in the same namespace will conflict when creating custom task resource objects with the same name but different specs.
- The Tekton controller is responsible for adding the custom task spec to the `Run` spec. Validation of the custom task is delegated to the custom controller.
Add support for `Run.RunSpec.Spec`.

Currently, `Run.RunSpec.Spec` is not supported, and there are validations across the
codebase to ensure that only `Run.RunSpec.Ref` is specified. As part of this TEP, in addition
to adding support for `Run.RunSpec.Spec`, the validations will be changed so that
exactly one of `Run.RunSpec.Spec` or `Run.RunSpec.Ref`, but not both, may be given
in a single API request to Kubernetes.
Introducing a new type, `v1alpha1.EmbeddedRunSpec`:

```go
// EmbeddedRunSpec allows custom task definitions to be embedded
type EmbeddedRunSpec struct {
	runtime.TypeMeta `json:",inline"`

	// +optional
	Metadata v1beta1.PipelineTaskMetadata `json:"metadata,omitempty"`

	// Spec is a specification of a custom task
	// +optional
	Spec runtime.RawExtension `json:"spec,omitempty"`
}
```
The structure of `RunSpec`, after adding the field `Spec` of type `EmbeddedRunSpec`:

```go
// RunSpec defines the desired state of Run
type RunSpec struct {
	// +optional
	Ref *TaskRef `json:"ref,omitempty"`

	// Spec is a specification of a custom task
	// +optional
	Spec *EmbeddedRunSpec `json:"spec,omitempty"`

	// +optional
	Params []v1beta1.Param `json:"params,omitempty"`

	// Used for cancelling a run (and maybe more later on)
	// +optional
	Status RunSpecStatus `json:"status,omitempty"`

	// +optional
	ServiceAccountName string `json:"serviceAccountName"`

	// PodTemplate holds pod specific configuration
	// +optional
	PodTemplate *PodTemplate `json:"podTemplate,omitempty"`

	// Workspaces is a list of WorkspaceBindings from volumes to workspaces.
	// +optional
	Workspaces []v1beta1.WorkspaceBinding `json:"workspaces,omitempty"`
}
```
An embedded task will accept new fields, i.e. `Spec` of type
`runtime.RawExtension`, plus `ApiVersion` and `Kind` fields of type string (as part of
`runtime.TypeMeta`):

```go
type EmbeddedTask struct {
	// +optional
	runtime.TypeMeta `json:",inline,omitempty"`

	// +optional
	Spec runtime.RawExtension `json:"spec,omitempty"`

	// +optional
	Metadata PipelineTaskMetadata `json:"metadata,omitempty"`

	// TaskSpec is a specification of a task
	// +optional
	TaskSpec `json:",inline,omitempty"`
}
```
An example `Run` spec, based on the Tektoncd/experimental task-loop controller, will look like:

```yaml
apiVersion: tekton.dev/v1alpha1
kind: Run
metadata:
  name: simpletasklooprun
spec:
  params:
    - name: word
      value:
        - jump
        - land
        - roll
    - name: suffix
      value: ing
  spec:
    apiVersion: custom.tekton.dev/v1alpha1
    kind: TaskLoop
    spec:
      # Task to run (inline taskSpec also works)
      taskRef:
        name: simpletask
      # Parameter that contains the values to iterate
      iterateParam: word
      # Timeout (defaults to global default timeout, usually 1h00m; use "0" for no timeout)
      timeout: 60s
      # Retries for task failure
      retries: 2
```
Another example, based on a `PipelineRun` spec, will look like the following.
Note that `spec.pipelineSpec.tasks.taskSpec.spec` holds the custom task spec.

```yaml
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: pr-loop-example
spec:
  pipelineSpec:
    tasks:
      - name: first-task
        taskSpec:
          steps:
            - name: echo
              image: ubuntu
              imagePullPolicy: IfNotPresent
              script: |
                #!/usr/bin/env bash
                echo "I am the first task before the loop task"
      - name: loop-task
        runAfter:
          - first-task
        params:
          - name: message
            value:
              - I am the first one
              - I am the second one
              - I am the third one
        taskSpec:
          apiVersion: custom.tekton.dev/v1alpha1
          kind: PipelineLoop
          spec:
            iterateParam: message
            pipelineSpec:
              params:
                - name: message
                  type: string
              tasks:
                - name: echo-loop-task
                  params:
                    - name: message
                      value: $(params.message)
                  taskSpec:
                    params:
                      - name: message
                        type: string
                    steps:
                      - name: echo
                        image: ubuntu
                        imagePullPolicy: IfNotPresent
                        script: |
                          #!/usr/bin/env bash
                          echo "$(params.message)"
      - name: last-task
        runAfter:
          - loop-task
        taskSpec:
          steps:
            - name: echo
              image: ubuntu
              imagePullPolicy: IfNotPresent
              script: |
                #!/usr/bin/env bash
                echo "I am the last task after the loop task"
```
`Tektoncd/pipeline` can only validate the structure and fields it knows about;
validation of the custom task spec field(s) is delegated to the custom task controller.
A custom controller may still choose not to support a `Spec`-based `Run` or
`PipelineRun` specification, by implementing validations at the custom controller end.
If the custom controller does not respond in either of these ways, i.e. with validation errors or by reconciling the CRD,
then the `PipelineRun` or `Run` will wait until the timeout and mark its status as `Failed`.
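A controller that chooses not to support `Spec`-based Runs could implement such a guard as in the following minimal sketch. The `Run`/`RunSpec` types and the `validateSupported` helper here are illustrative stand-ins, not the real Tekton API types or any actual controller code:

```go
package main

import (
	"errors"
	"fmt"
)

// Minimal stand-ins for the real Run API types (illustrative only).
type TaskRef struct{ Name string }
type EmbeddedRunSpec struct{ Kind string }
type RunSpec struct {
	Ref  *TaskRef
	Spec *EmbeddedRunSpec
}
type Run struct{ Spec RunSpec }

// validateSupported rejects embedded specs up front, so the Run fails fast
// with a clear error instead of sitting idle until the timeout.
func validateSupported(r *Run) error {
	if r.Spec.Spec != nil {
		return errors.New("embedded custom task spec is not supported by this controller; use spec.ref")
	}
	if r.Spec.Ref == nil {
		return errors.New("missing spec.ref")
	}
	return nil
}

func main() {
	fmt.Println(validateSupported(&Run{Spec: RunSpec{Ref: &TaskRef{Name: "simpletask"}}})) // <nil>
	fmt.Println(validateSupported(&Run{Spec: RunSpec{Spec: &EmbeddedRunSpec{Kind: "TaskLoop"}}}))
}
```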
What is the fate of an existing custom controller developed prior to the implementation of this TEP? If the
custom controller implemented validation for a missing `Ref`, then a `PipelineRun` or `Run`
missing a `Ref` will fail immediately with the configured error. If, however, no validation was
implemented for a missing `Ref`, it can even lead to nil dereference errors, or share the fate
of a custom controller that does not respond to a missing `Spec` or `Ref`.

A poorly implemented custom task controller might neglect validation or manifest erroneous behaviour beyond
the control of `tektoncd/pipeline`. This is true of any custom task
implementation, whether `Spec` or `Ref` based.
With an embedded `taskSpec` for the custom task, all Tekton clients
can create a pipeline or `pipelineRun` using a single API call to Kubernetes.
Downstream systems that employ tektoncd, e.g. Kubeflow Pipelines, no longer have to
manage the lifecycle of custom task resource objects (e.g. generating unique names)
and their versioning.

It is natural for a user to follow patterns such as defining a
`PodTemplateSpec` as the Kubernetes pod definition inside a
Kubernetes Deployment, ReplicaSet, or StatefulSet.
Tektoncd/Pipeline with embedded custom tasks offers a similar, familiar experience.

Performance improvement is a consequence of the reduced number of API requests needed to create the custom resources accompanying a pipeline. In pipelines where the number of custom task resource objects is large, this can make a substantial difference.
For end users trying to render custom task resource details on a UI dashboard, the experience can be much smoother if everything can be fetched in fewer API requests.
The actual code changes needed to implement this TEP are minimal.
The broad categories are:

- Add the relevant APIs. Already covered in the Proposal section.
- Change the validation logic to accept the newly added API fields. Currently
  `tektoncd/pipeline` rejects any `Run` request that does not include a `Run.RunSpec.Ref`.
  This validation is changed so that exactly one of `Ref` or `Spec` must be present, but not both.
  Next, whether it is a `Ref` or a `Spec`, the validation logic ensures it has non-empty values for
  `APIVersion` and `Kind`. Lastly, advice is documented for downstream custom controllers to
  implement their own validation logic; this aspect is covered in full detail in the
  Upgrade & Migration Strategy section of this TEP.
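The "exactly one of `Ref` or `Spec`, each with non-empty `APIVersion` and `Kind`" rule can be sketched as follows. This is a minimal, self-contained illustration: `TaskRef`, `EmbeddedRunSpec`, and `RunSpec` are stripped-down stand-ins for the real Tekton types, and `validate` is a hypothetical helper, not the actual webhook code:

```go
package main

import (
	"errors"
	"fmt"
)

// Stripped-down stand-ins for the real Tekton types (illustrative only).
type TaskRef struct {
	APIVersion string
	Kind       string
	Name       string
}

type EmbeddedRunSpec struct {
	APIVersion string
	Kind       string
}

type RunSpec struct {
	Ref  *TaskRef
	Spec *EmbeddedRunSpec
}

// validate enforces "exactly one of Ref or Spec", each carrying a
// non-empty apiVersion and kind, mirroring the rules described above.
func (rs *RunSpec) validate() error {
	switch {
	case rs.Ref != nil && rs.Spec != nil:
		return errors.New("expected exactly one, got both: spec.ref, spec.spec")
	case rs.Ref == nil && rs.Spec == nil:
		return errors.New("expected exactly one, got neither: spec.ref, spec.spec")
	case rs.Ref != nil:
		if rs.Ref.APIVersion == "" || rs.Ref.Kind == "" {
			return errors.New("spec.ref must have non-empty apiVersion and kind")
		}
	default:
		if rs.Spec.APIVersion == "" || rs.Spec.Kind == "" {
			return errors.New("spec.spec must have non-empty apiVersion and kind")
		}
	}
	return nil
}

func main() {
	ok := RunSpec{Spec: &EmbeddedRunSpec{APIVersion: "custom.tekton.dev/v1alpha1", Kind: "TaskLoop"}}
	fmt.Println(ok.validate()) // <nil>

	bad := RunSpec{Ref: &TaskRef{Name: "simpletask"}, Spec: &EmbeddedRunSpec{}}
	fmt.Println(bad.validate()) // expected exactly one, got both: spec.ref, spec.spec
}
```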
This TEP does not change the existing flow for creating the `Run` object; it updates
the `Run` object with the content of `RunSpec.Spec` by marshalling the
`Spec runtime.RawExtension` field to JSON and embedding it in the spec before creating the
`Run` object.
We can reuse the current custom task e2e tests, which simulate a custom task controller
updating the `Run` status, and verify whether the controller can also handle an embedded
custom task `taskSpec`.
Before the implementation of this TEP, i.e. without support for embedding a
custom task spec in the `PipelineRun` resource, a user has to make multiple API
requests to the API server, and has to ensure unique names to avoid conflicts
with custom task resource objects created by others.

Embedding the custom task spec avoids the problems related to name collisions and also improves performance by reducing the number of API requests needed to create custom task resource objects. The performance benefit of fewer API requests is most evident when using a web-UI-based dashboard to display pipeline details (e.g. in Kubeflow Pipelines with Tekton as the backend).

Lastly, it is aesthetically nicer and coherent with existing regular tasks, with the entire custom task spec expressed in fewer lines of YAML, all in one place.
Use `v1beta1.EmbeddedTask` as `RunSpec.Spec`, so that we
don't have to introduce a new embedded spec type for runs.

Cons:

- It brings some `PipelineTask`-specific fields (like PipelineResources) that don't have a use case in Runs yet.
- Existing custom controllers need to upgrade their validation logic.

  Rationale: previously, there was only one possibility for the structure of
  `Run` objects, i.e. they had the path `Run.RunSpec.Ref`. A custom controller could do fine
  even without validating input requests that miss a `Ref`, because this was already validated by
  `tektoncd/pipeline`. After the implementation of this TEP, this is no longer the case: a
  `Run.RunSpec` may contain either a `Ref` or a `Spec`. So a request with a `Spec`, sent to a
  controller that lacks proper validation for a missing `Ref` and does not yet support a `Spec`,
  may end up in an unstable state, e.g. due to nil dereference errors, or fail due to timeout.

- Support `spec` or `taskSpec` in existing custom controllers.

  With the implementation of this TEP, users can supply a custom task spec embedded in a
  `PipelineRun` or `Run`. Existing custom controllers need to upgrade to provide this support.
Unmarshalling the json of custom task object embedded as
Spec
:Run.RunSpec.Spec
objects are marshalled as binary by usingjson.Marshal
wherejson
is imported fromencoding/json
library of golang. So the custom controller may unmarshall these objects by using the correspondingunmarshall
function as,json.Unmarshal(run.Spec.Spec.Spec.Raw, &customObjectSpec)
. In the future, a custom task SDK will do a better job of handling it, and making it easier for the developer to work on custom task controller. TODO: Add a reference to an example custom task controller e.g.TaskLoop
, once the changes are merged.
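On the custom controller side, that decode step can be sketched as follows. The `taskLoopSpec` type and the literal raw bytes are illustrative; in a real controller the bytes come from `run.Spec.Spec.Spec.Raw`:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// taskLoopSpec mirrors the fields this hypothetical controller expects to
// find in the embedded custom task spec.
type taskLoopSpec struct {
	IterateParam string `json:"iterateParam"`
	Retries      int    `json:"retries"`
}

// decodeSpec is what a reconciler would do with run.Spec.Spec.Spec.Raw:
// unmarshal the raw JSON into the controller's own spec type.
func decodeSpec(raw []byte) (taskLoopSpec, error) {
	var spec taskLoopSpec
	err := json.Unmarshal(raw, &spec)
	return spec, err
}

func main() {
	// In a real controller, raw would be run.Spec.Spec.Spec.Raw.
	raw := []byte(`{"iterateParam":"word","retries":2}`)
	spec, err := decodeSpec(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(spec.IterateParam, spec.Retries) // word 2
}
```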