
Larger results using sidecar logs
Prior to this, we extracted results from tasks via termination messages, which are limited to 4 KB per pod. Users with many results had to shrink each result to stay within that combined 4 KB cap.

We now run a dedicated sidecar that has access to the results of all the steps. The sidecar prints each result's name and content to stdout. The TaskRun controller parses the sidecar's logs and updates the results from there instead of from the termination message. We set an upper limit of 1 KB per result, but users can have as many such results as needed.
chitrangpatel committed Oct 28, 2022
1 parent 6bc53ca commit 06bbc1d
Showing 20 changed files with 643 additions and 37 deletions.
16 changes: 9 additions & 7 deletions cmd/entrypoint/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,8 @@ var (
breakpointOnFailure = flag.Bool("breakpoint_on_failure", false, "If specified, expect steps to not skip on failure")
onError = flag.String("on_error", "", "Set to \"continue\" to ignore an error and continue when a container terminates with a non-zero exit code."+
" Set to \"stopAndFail\" to declare a failure with a step error and stop executing the rest of the steps.")
stepMetadataDir = flag.String("step_metadata_dir", "", "If specified, create directory to store the step metadata e.g. /tekton/steps/<step-name>/")
stepMetadataDir = flag.String("step_metadata_dir", "", "If specified, create directory to store the step metadata e.g. /tekton/steps/<step-name>/")
dontSendResultsToTerminationPath = flag.Bool("dont_send_results_to_termination_path", false, "If specified, don't send results to the termination path.")
)

const (
Expand Down Expand Up @@ -142,12 +143,13 @@ func main() {
stdoutPath: *stdoutPath,
stderrPath: *stderrPath,
},
PostWriter: &realPostWriter{},
Results: strings.Split(*results, ","),
Timeout: timeout,
BreakpointOnFailure: *breakpointOnFailure,
OnError: *onError,
StepMetadataDir: *stepMetadataDir,
PostWriter: &realPostWriter{},
Results: strings.Split(*results, ","),
Timeout: timeout,
BreakpointOnFailure: *breakpointOnFailure,
OnError: *onError,
StepMetadataDir: *stepMetadataDir,
DontSendResultsToTerminationPath: *dontSendResultsToTerminationPath,
}

// Copy any creds injected by the controller into the $HOME directory of the current
Expand Down
13 changes: 13 additions & 0 deletions config/enable-log-access-to-controller/clusterrole.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: tekton-pipelines-controller-pod-log-access
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: default
app.kubernetes.io/part-of: tekton-pipelines
rules:
- apiGroups: [""]
# Controller needs to get the logs of the results sidecar created by TaskRuns to extract results.
resources: ["pods/log"]
verbs: ["get"]
16 changes: 16 additions & 0 deletions config/enable-log-access-to-controller/clusterrolebinding.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: tekton-pipelines-controller-pod-log-access
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: default
app.kubernetes.io/part-of: tekton-pipelines
subjects:
- kind: ServiceAccount
name: tekton-pipelines-controller
namespace: tekton-pipelines
roleRef:
kind: ClusterRole
name: tekton-pipelines-controller-pod-log-access
apiGroup: rbac.authorization.k8s.io
52 changes: 52 additions & 0 deletions docs/install.md
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ This guide explains how to install Tekton Pipelines. It covers the following top
- [Customizing the Pipelines Controller behavior](#customizing-the-pipelines-controller-behavior)
- [Alpha Features](#alpha-features)
- [Beta Features](#beta-features)
- [Enabling larger results using sidecar logs](#enabling-larger-results-using-sidecar-logs)
- [Configuring High Availability](#configuring-high-availability)
- [Configuring tekton pipeline controller performance](#configuring-tekton-pipeline-controller-performance)
- [Creating a custom release of Tekton Pipelines](#creating-a-custom-release-of-tekton-pipelines)
Expand Down Expand Up @@ -420,6 +421,8 @@ features](#alpha-features) to be used.
name, kind, and API version information for each `TaskRun` and `Run` in the `PipelineRun` instead. Set it to "both" to
do both. For more information, see [Configuring usage of `TaskRun` and `Run` embedded statuses](pipelineruns.md#configuring-usage-of-taskrun-and-run-embedded-statuses).

- `enable-sidecar-logs-results`: Set this flag to "true" to extract results from the logs of a dedicated results sidecar instead of from the termination message. While the termination message restricts the combined size of all results to 4 KB per pod, enabling this feature allows up to 1 KB per result, with as many results as required.

For example:

```yaml
Expand Down Expand Up @@ -467,6 +470,55 @@ the `feature-flags` ConfigMap alongside your Tekton Pipelines deployment via

For beta versions of Tekton CRDs, setting `enable-api-fields` to "beta" is the same as setting it to "stable".

## Enabling larger results using sidecar logs

**Note**: The maximum size of a Task's results is limited by the container termination message feature of Kubernetes, as results are passed back to the controller via this mechanism. At present, the limit is 4096 bytes.

To exceed this limit of 4096 bytes, you can enable larger results using sidecar logs. By enabling this feature, you will have a limit of 1024 bytes per result with no restriction on the number of results.

**Note**: To enable this feature, you must grant the Tekton pipelines controller `get` access to all `pods/log`. This means the controller is able to read the logs of any pod in the cluster.

1. Create a cluster role by applying the following spec.

```yaml
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: tekton-pipelines-controller-pod-log-access
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: default
app.kubernetes.io/part-of: tekton-pipelines
rules:
- apiGroups: [""]
# Controller needs to get the logs of the results sidecar created by TaskRuns to extract results.
resources: ["pods/log"]
verbs: ["get"]
```

2. Create a cluster role binding by applying the following spec.

```yaml
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: tekton-pipelines-controller-pod-log-access
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: default
app.kubernetes.io/part-of: tekton-pipelines
subjects:
- kind: ServiceAccount
name: tekton-pipelines-controller
namespace: tekton-pipelines
roleRef:
kind: ClusterRole
name: tekton-pipelines-controller-pod-log-access
apiGroup: rbac.authorization.k8s.io
```

3. Enable the feature flag to use sidecar logs by setting `enable-sidecar-logs-results: "true"` in the [configMap](#customizing-the-pipelines-controller-behavior).
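   As a sketch (assuming the default installation namespace `tekton-pipelines` and the default ConfigMap name `feature-flags`), the flag can be flipped in place with `kubectl patch`:

   ```shell
   # Enable sidecar-logs-based results in the feature-flags ConfigMap
   # (assumes the default "tekton-pipelines" namespace).
   kubectl patch configmap feature-flags \
     -n tekton-pipelines \
     --type merge \
     -p '{"data":{"enable-sidecar-logs-results":"true"}}'
   ```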

## Configuring High Availability

If you want to run Tekton Pipelines so that webhooks are resilient against failures and support
Expand Down
12 changes: 11 additions & 1 deletion docs/tasks.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ weight: 200
- [Specifying `Resources`](#specifying-resources)
- [Specifying `Workspaces`](#specifying-workspaces)
- [Emitting `Results`](#emitting-results)
- [Larger `Results` using sidecar logs](#larger-results-using-sidecar-logs)
- [Specifying `Volumes`](#specifying-volumes)
- [Specifying a `Step` template](#specifying-a-step-template)
- [Specifying `Sidecars`](#specifying-sidecars)
Expand Down Expand Up @@ -835,7 +836,7 @@ This also means that the number of Steps in a Task affects the maximum size of a
as each Step is implemented as a container in the TaskRun's pod.
The more containers we have in our pod, *the smaller the allowed size of each container's
message*, meaning that the **more steps you have in a Task, the smaller the result for each step can be**.
For example, if you have 10 steps, the size of each step's Result will have a maximum of less than 1KB*.
For example, if you have 10 steps, the size of each step's Result will have a maximum of less than 1KB.

If your `Task` writes a large number of small results, you can work around this limitation
by writing each result from a separate `Step` so that each `Step` has its own termination message.
Expand All @@ -847,6 +848,15 @@ available size will be less than 4096 bytes.
As a general rule-of-thumb, if a result needs to be larger than a kilobyte, you should likely use a
[`Workspace`](#specifying-workspaces) to store and pass it between `Tasks` within a `Pipeline`.

#### Larger `Results` using sidecar logs

This is an experimental feature. The [`enable-sidecar-logs-results` feature flag must be set to `"true"`](./install.md#enabling-larger-results-using-sidecar-logs).

Instead of using termination messages to store results, the TaskRun controller injects a sidecar container which monitors the results of all the steps. The sidecar mounts the volume where the results of all the steps are stored. As soon as it finds a new result, it logs it to stdout. The controller parses the sidecar's logs to extract the results (caution: this requires granting the controller access to [kubernetes pods/log](./install.md#enabling-larger-results-using-sidecar-logs)).

**Note**: This feature allows users to store up to 1 KB per result. Because results are no longer limited by the size of the termination message, users can have as many results as they require, where each result can be up to 1 KB in size. If the size of a result exceeds 1 KB, the TaskRun is placed into a failed state with the following message: `Result exceeded the maximum allowed limit of 1024 bytes.`

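To estimate whether a result will fit under the cap, note that base64 output occupies `ceil(N/3) * 4` characters for `N` input bytes, so the ~750-byte reads used in the examples in this repo encode to exactly 1000 characters, just under the 1024-byte limit. A quick sketch with standard POSIX tools:

```shell
# base64 of N bytes is ceil(N/3)*4 characters (ignoring line wraps);
# 750 random bytes -> exactly 1000 characters, under the 1024-byte cap.
size=$(head -c 750 /dev/urandom | base64 | tr -d '\n' | wc -c)
echo "encoded size: ${size} bytes"
# A 770-byte read would encode to ceil(770/3)*4 = 1028 characters,
# which would exceed the limit and fail the TaskRun.
```
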
### Specifying `Volumes`

Specifies one or more [`Volumes`](https://kubernetes.io/docs/concepts/storage/volumes/) that the `Steps` in your
Expand Down
81 changes: 81 additions & 0 deletions examples/v1beta1/pipelineruns/alpha/pipelinerun-large-results.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,81 @@
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
name: large-result
spec:
results:
- name: result1
- name: result2
- name: result3
- name: result4
- name: result5
steps:
- name: step1
image: alpine
script: |
cat /dev/urandom | head -c 750 | base64 | tee $(results.result1.path);
cat /dev/urandom | head -c 750 | base64 | tee $(results.result2.path);
cat /dev/urandom | head -c 750 | base64 | tee $(results.result3.path);
cat /dev/urandom | head -c 750 | base64 | tee $(results.result4.path);
cat /dev/urandom | head -c 750 | base64 | tee $(results.result5.path);
---
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
name: concat-text
spec:
params:
- name: param1
- name: param2
- name: param3
results:
- name: concatenated-text
description: concatenate strings
steps:
- name: concat
image: alpine
command: ["/bin/sh", "-c"]
args:
- echo $(params.param1) +++ $(params.param2) +++ $(params.param3)| tee $(results.concatenated-text.path) ;
---
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
name: concat-text-pipeline
spec:
tasks:
- name: first-task
taskRef:
name: large-result
- name: second-task
taskRef:
name: large-result
- name: third-task
taskRef:
name: large-result
- name: last-task
runAfter:
- first-task
- second-task
- third-task
params:
- name: param1
value: $(tasks.first-task.results.result1)
- name: param2
value: $(tasks.second-task.results.result3)
- name: param3
value: $(tasks.third-task.results.result5)
taskRef:
name: concat-text
results:
- name: sum
description: the concat of all texts
value: $(tasks.last-task.results.concatenated-text)
---
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
name: concat-text-pipeline-run
spec:
pipelineRef:
name: concat-text-pipeline
28 changes: 28 additions & 0 deletions examples/v1beta1/taskruns/alpha/large-task-result.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
generateName: larger-results-
spec:
taskSpec:
description: |
A task that creates results > termination message limit of 4K per pod!
results:
- name: result1
- name: result2
- name: result3
- name: result4
- name: result5
steps:
- name: step1
image: bash:latest
script: |
#!/usr/bin/env bash
cat /dev/urandom | head -c 750 | base64 | tee /tekton/results/result1 # about 1 KB result
cat /dev/urandom | head -c 750 | base64 | tee /tekton/results/result2 # about 1 KB result
- name: step2
image: bash:latest
script: |
#!/usr/bin/env bash
cat /dev/urandom | head -c 750 | base64 | tee /tekton/results/result3 # about 1 KB result
cat /dev/urandom | head -c 750 | base64 | tee /tekton/results/result4 # about 1 KB result
cat /dev/urandom | head -c 750 | base64 | tee /tekton/results/result5 # about 1 KB result
7 changes: 7 additions & 0 deletions pkg/apis/config/feature_flags.go
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,8 @@ const (
DefaultEmbeddedStatus = FullEmbeddedStatus
// DefaultEnableSpire is the default value for "enable-spire".
DefaultEnableSpire = false
// DefaultSidecarLogsResults is the default value for "enable-sidecar-logs-results".
DefaultSidecarLogsResults = false

disableAffinityAssistantKey = "disable-affinity-assistant"
disableCredsInitKey = "disable-creds-init"
Expand All @@ -76,6 +78,7 @@ const (
sendCloudEventsForRuns = "send-cloudevents-for-runs"
embeddedStatus = "embedded-status"
enableSpire = "enable-spire"
enableSidecarLogsResults = "enable-sidecar-logs-results"
)

// FeatureFlags holds the features configurations
Expand All @@ -93,6 +96,7 @@ type FeatureFlags struct {
AwaitSidecarReadiness bool
EmbeddedStatus string
EnableSpire bool
EnableSidecarLogsResults bool
}

// GetFeatureFlagsConfigName returns the name of the configmap containing all
Expand Down Expand Up @@ -144,6 +148,9 @@ func NewFeatureFlagsFromMap(cfgMap map[string]string) (*FeatureFlags, error) {
if err := setEmbeddedStatus(cfgMap, DefaultEmbeddedStatus, &tc.EmbeddedStatus); err != nil {
return nil, err
}
if err := setFeature(enableSidecarLogsResults, DefaultSidecarLogsResults, &tc.EnableSidecarLogsResults); err != nil {
return nil, err
}

// Given that they are alpha features, Tekton Bundles and Custom Tasks should be switched on if
// enable-api-fields is "alpha". If enable-api-fields is not "alpha" then fall back to the value of
Expand Down
2 changes: 2 additions & 0 deletions pkg/apis/pipeline/v1beta1/taskrun_types.go
Original file line number Diff line number Diff line change
Expand Up @@ -181,6 +181,8 @@ const (
TaskRunReasonsResultsVerificationFailed TaskRunReason = "TaskRunResultsVerificationFailed"
// AwaitingTaskRunResults is the reason set when waiting upon `TaskRun` results and signatures to verify
AwaitingTaskRunResults TaskRunReason = "AwaitingTaskRunResults"
// TaskRunReasonResultLargerThanAllowedLimit is the reason set when one of the results exceeds its maximum allowed limit
TaskRunReasonResultLargerThanAllowedLimit TaskRunReason = "TaskRunResultLargerThanAllowedLimit"
)

func (t TaskRunReason) String() string {
Expand Down
5 changes: 4 additions & 1 deletion pkg/entrypoint/entrypointer.go
Original file line number Diff line number Diff line change
Expand Up @@ -80,6 +80,9 @@ type Entrypointer struct {
OnError string
// StepMetadataDir is the directory for a step where the step related metadata can be stored
StepMetadataDir string

// DontSendResultsToTerminationPath, when true, skips writing results to the termination path.
DontSendResultsToTerminationPath bool
}

// Waiter encapsulates waiting for files to exist.
Expand Down Expand Up @@ -183,7 +186,7 @@ func (e Entrypointer) Go() error {

// strings.Split(..) with an empty string returns an array that contains one element, an empty string.
// This creates an error when trying to open the result folder as a file.
if len(e.Results) >= 1 && e.Results[0] != "" {
if !e.DontSendResultsToTerminationPath && len(e.Results) >= 1 && e.Results[0] != "" {
if err := e.readResultsFromDisk(pipeline.DefaultResultPath); err != nil {
logger.Fatalf("Error while handling results: %s", err)
}
Expand Down
5 changes: 4 additions & 1 deletion pkg/pod/entrypoint.go
Original file line number Diff line number Diff line change
Expand Up @@ -108,7 +108,7 @@ var (
// command, we must have fetched the image's ENTRYPOINT before calling this
// method, using entrypoint_lookup.go.
// Additionally, Step timeouts are added as entrypoint flag.
func orderContainers(commonExtraEntrypointArgs []string, steps []corev1.Container, taskSpec *v1beta1.TaskSpec, breakpointConfig *v1beta1.TaskRunDebug, waitForReadyAnnotation bool) ([]corev1.Container, error) {
func orderContainers(commonExtraEntrypointArgs []string, steps []corev1.Container, taskSpec *v1beta1.TaskSpec, breakpointConfig *v1beta1.TaskRunDebug, waitForReadyAnnotation bool, isSidecarLogsResultsEnabled bool) ([]corev1.Container, error) {
if len(steps) == 0 {
return nil, errors.New("No steps specified")
}
Expand All @@ -133,6 +133,9 @@ func orderContainers(commonExtraEntrypointArgs []string, steps []corev1.Containe
"-termination_path", terminationPath,
"-step_metadata_dir", filepath.Join(runDir, idx, "status"),
)
if isSidecarLogsResultsEnabled {
argsForEntrypoint = append(argsForEntrypoint, "-dont_send_results_to_termination_path")
}
argsForEntrypoint = append(argsForEntrypoint, commonExtraEntrypointArgs...)
if taskSpec != nil {
if taskSpec.Steps != nil && len(taskSpec.Steps) >= i+1 {
Expand Down
