Skip to content

Commit

Permalink
Remove extra log PVC
Browse files Browse the repository at this point in the history
We noticed early on that logs from init containers are often cleaned up
immediately by k8s, particularly if the containers are short running
(e.g. just echoing "hello world"). We started down a path to correct
that, which takes an approach based on Prow's entrypoint solution
(https://github.com/kubernetes/test-infra/tree/master/prow/cmd/entrypoint)
(even using the same image at the moment!) which wraps the user's
provided command and streams logs to a volume, from which the logs can
be uploaded/streamed by a sidecar.

Since we are using init containers for step execution, we can't yet use
sidecars, but we are addressing that in tektoncd#224 (also an entrypoint
re-writing based solution). Once we have that, we can sidecar support,
starting with GCS as a POC (#107) and moving into other types.

In the meantime, to enable us to get logs (particularly in tests), we
had the taskrun controller create a PVC on the fly to hold logs. This
has two problems:
* The PVCs are not cleaned up so this is an unexpected side effect for
  users
* Combined with PVC based input + ouput linking, this causes scheduling
  problems for the resulting pods (tektoncd#375)

Now that we want to have an official release, this would be a bad state
to release in, so we will remove this magical log PVC creation logic,
which was never our intended end state anyway.

Since we _do_ need the entrypoint rewriting and log interception logic
in the long run, this commit leaves most functionality intact, removing
only the PVC creation and changing the volume being used to an
`emptyDir`, which is what we will likely use for #107 (and this is how
Prow handles this as well). This means the released functionality will
be streaming logs to a location where nothing can read them, however I
think it is better than completely removing the functionality b/c:
1. We need the functionality in the long run
2. Users should be prepared for this functionality (e.g. dealing with
   edge cases around the taskrun controller being able to fetch an
   image's entrypoint)

Fixes tektoncd#387
  • Loading branch information
bobcatfish committed Jan 30, 2019
1 parent 23a2b91 commit f357829
Show file tree
Hide file tree
Showing 11 changed files with 189 additions and 390 deletions.
11 changes: 8 additions & 3 deletions docs/using.md
Original file line number Diff line number Diff line change
Expand Up @@ -123,9 +123,14 @@ specific contract.
#### Entrypoint

When containers are run in a `Task`, the `entrypoint` of the container will be
overwritten with a custom binary that redirects the logs to a separate location
for aggregating the log output. As such, it is always recommended to explicitly
specify a command.
overwritten with a custom binary. The plan is to use this custom binary for
controlling the execution of step containers ([#224](https://github.com/knative/build-pipeline/issues/224)) and log streaming
[#107](https://github.com/knative/build-pipeline/issues/107), though currently
it will write logs only to an [`emptyDir`]() (which cannot be read from after
the pod has finished executing, so logs must be obtained
[via k8s logs](https://kubernetes.io/docs/concepts/cluster-administration/logging/),
using a tool such as [test/logs/README.md](../test/logs/README.md),
or setting up an external system to consume logs).

When `command` is not explicitly set, the controller will attempt to lookup the
entrypoint from the remote registry.
Expand Down
60 changes: 7 additions & 53 deletions pkg/reconciler/v1alpha1/taskrun/taskrun.go
Original file line number Diff line number Diff line change
Expand Up @@ -39,11 +39,9 @@ import (
corev1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/api/equality"
"k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/api/resource"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/runtime/schema"
coreinformers "k8s.io/client-go/informers/core/v1"
"k8s.io/client-go/kubernetes"
"k8s.io/client-go/tools/cache"
)

Expand Down Expand Up @@ -71,8 +69,6 @@ const (
taskRunAgentName = "taskrun-controller"
// taskRunControllerName defines name for TaskRun Controller
taskRunControllerName = "TaskRun"

pvcSizeBytes = 5 * 1024 * 1024 * 1024 // 5 GBs
)

var (
Expand Down Expand Up @@ -276,19 +272,8 @@ func (c *Reconciler) reconcile(ctx context.Context, tr *v1alpha1.TaskRun) error
return err
}
} else {
pvc, err := c.KubeClientSet.CoreV1().PersistentVolumeClaims(tr.Namespace).Get(tr.Name, metav1.GetOptions{})
if errors.IsNotFound(err) {
// Create a persistent volume claim to hold Build logs
pvc, err = createPVC(c.KubeClientSet, tr)
if err != nil {
return fmt.Errorf("Failed to create persistent volume claim %s for task %q: %v", tr.Name, err, tr.Name)
}
} else if err != nil {
c.Logger.Errorf("Failed to reconcile taskrun: %q, failed to get pvc %q: %v", tr.Name, tr.Name, err)
return err
}
// Build pod is not present, create build pod.
pod, err = c.createBuildPod(ctx, tr, rtr.TaskSpec, rtr.TaskName, pvc.Name)
pod, err = c.createBuildPod(ctx, tr, rtr.TaskSpec, rtr.TaskName)
if err != nil {
// This Run has failed, so we need to mark it as failed and stop reconciling it
var msg string
Expand Down Expand Up @@ -368,40 +353,9 @@ func (c *Reconciler) updateStatus(taskrun *v1alpha1.TaskRun) (*v1alpha1.TaskRun,
return newtaskrun, nil
}

// createPVC will create a persistent volume mount for tr which
// will be used to gather logs using the entrypoint wrapper
func createPVC(kc kubernetes.Interface, tr *v1alpha1.TaskRun) (*corev1.PersistentVolumeClaim, error) {
v, err := kc.CoreV1().PersistentVolumeClaims(tr.Namespace).Create(
&corev1.PersistentVolumeClaim{
ObjectMeta: metav1.ObjectMeta{
Namespace: tr.Namespace,
// This pvc is specific to this TaskRun, so we'll use the same name
Name: tr.Name,
OwnerReferences: []metav1.OwnerReference{
*metav1.NewControllerRef(tr, groupVersionKind),
},
},
Spec: corev1.PersistentVolumeClaimSpec{
AccessModes: []corev1.PersistentVolumeAccessMode{
corev1.ReadWriteOnce,
},
Resources: corev1.ResourceRequirements{
Requests: map[corev1.ResourceName]resource.Quantity{
corev1.ResourceStorage: *resource.NewQuantity(pvcSizeBytes, resource.BinarySI),
},
},
},
},
)
if err != nil {
return nil, fmt.Errorf("failed to claim Persistent Volume %q due to error: %s", tr.Name, err)
}
return v, nil
}

// createPod creates a Pod based on the Task's configuration, with pvcName as a
// volumeMount
func (c *Reconciler) createBuildPod(ctx context.Context, tr *v1alpha1.TaskRun, ts *v1alpha1.TaskSpec, taskName, pvcName string) (*corev1.Pod, error) {
func (c *Reconciler) createBuildPod(ctx context.Context, tr *v1alpha1.TaskRun, ts *v1alpha1.TaskSpec, taskName string) (*corev1.Pod, error) {
// TODO: Preferably use Validate on task.spec to catch validation error
bs := ts.GetBuildSpec()
if bs == nil {
Expand Down Expand Up @@ -434,7 +388,7 @@ func (c *Reconciler) createBuildPod(ctx context.Context, tr *v1alpha1.TaskRun, t
}
}

build, err := createRedirectedBuild(ctx, bSpec, pvcName, tr)
build, err := createRedirectedBuild(ctx, bSpec, tr)
if err != nil {
return nil, fmt.Errorf("couldn't create redirected Build: %v", err)
}
Expand Down Expand Up @@ -488,7 +442,7 @@ func (c *Reconciler) createBuildPod(ctx context.Context, tr *v1alpha1.TaskRun, t
// an entrypoint cache creates a build where all entrypoints are switched to
// be the entrypoint redirector binary. This function assumes that it receives
// its own copy of the BuildSpec and modifies it freely
func createRedirectedBuild(ctx context.Context, bs *buildv1alpha1.BuildSpec, pvcName string, tr *v1alpha1.TaskRun) (*buildv1alpha1.Build, error) {
func createRedirectedBuild(ctx context.Context, bs *buildv1alpha1.BuildSpec, tr *v1alpha1.TaskRun) (*buildv1alpha1.Build, error) {
// Pass service account name from taskrun to build
bs.ServiceAccountName = tr.Spec.ServiceAccount

Expand Down Expand Up @@ -519,9 +473,9 @@ func createRedirectedBuild(ctx context.Context, bs *buildv1alpha1.BuildSpec, pvc
b.Spec.Volumes = append(b.Spec.Volumes, corev1.Volume{
Name: entrypoint.MountName,
VolumeSource: corev1.VolumeSource{
PersistentVolumeClaim: &corev1.PersistentVolumeClaimVolumeSource{
ClaimName: pvcName,
},
// TODO(#107) we need to actually stream these logs somewhere, probably via sidecar.
// Currently these logs will be lost when the pod is unscheduled.
EmptyDir: &corev1.EmptyDirVolumeSource{},
},
})

Expand Down
51 changes: 2 additions & 49 deletions pkg/reconciler/v1alpha1/taskrun/taskrun_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,6 @@ import (
"go.uber.org/zap"
"go.uber.org/zap/zaptest/observer"
corev1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/api/resource"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/runtime"
fakekubeclientset "k8s.io/client-go/kubernetes/fake"
Expand Down Expand Up @@ -108,36 +107,11 @@ var (
))
)

func getExpectedPVC(tr *v1alpha1.TaskRun) *corev1.PersistentVolumeClaim {
return &corev1.PersistentVolumeClaim{
ObjectMeta: metav1.ObjectMeta{
Namespace: tr.Namespace,
// This pvc is specific to this TaskRun, so we'll use the same name
Name: tr.Name,
OwnerReferences: []metav1.OwnerReference{
*metav1.NewControllerRef(tr, groupVersionKind),
},
},
Spec: corev1.PersistentVolumeClaimSpec{
AccessModes: []corev1.PersistentVolumeAccessMode{
corev1.ReadWriteOnce,
},
Resources: corev1.ResourceRequirements{
Requests: map[corev1.ResourceName]resource.Quantity{
corev1.ResourceStorage: *resource.NewQuantity(pvcSizeBytes, resource.BinarySI),
},
},
},
}
}

func getToolsVolume(claimName string) corev1.Volume {
return corev1.Volume{
Name: toolsMountName,
VolumeSource: corev1.VolumeSource{
PersistentVolumeClaim: &corev1.PersistentVolumeClaimVolumeSource{
ClaimName: claimName,
},
EmptyDir: &corev1.EmptyDirVolumeSource{},
},
}
}
Expand Down Expand Up @@ -422,27 +396,6 @@ func TestReconcile(t *testing.T) {
if len(clients.Kube.Actions()) == 0 {
t.Fatalf("Expected actions to be logged in the kubeclient, got none")
}

pvc, err := clients.Kube.CoreV1().PersistentVolumeClaims(namespace).Get(name, metav1.GetOptions{})
if err != nil {
t.Errorf("Failed to fetch build: %v", err)
}

expectedVolume := getExpectedPVC(tr)
if d := cmp.Diff(pvc.Name, expectedVolume.Name); d != "" {
t.Errorf("pvc doesn't match, diff: %s", d)
}
if d := cmp.Diff(pvc.OwnerReferences, expectedVolume.OwnerReferences); d != "" {
t.Errorf("pvc doesn't match, diff: %s", d)
}
if d := cmp.Diff(pvc.Spec.AccessModes, expectedVolume.Spec.AccessModes); d != "" {
t.Errorf("pvc doesn't match, diff: %s", d)
}
if pvc.Spec.Resources.Requests["storage"] != expectedVolume.Spec.Resources.Requests["storage"] {
t.Errorf("pvc doesn't match, got: %v, expected: %v",
pvc.Spec.Resources.Requests["storage"],
expectedVolume.Spec.Resources.Requests["storage"])
}
})
}
}
Expand Down Expand Up @@ -787,7 +740,7 @@ func TestCreateRedirectedBuild(t *testing.T) {
expectedSteps := len(bs.Steps) + 1
expectedVolumes := len(bs.Volumes) + 1

b, err := createRedirectedBuild(ctx, &bs, "pvc", tr)
b, err := createRedirectedBuild(ctx, &bs, tr)
if err != nil {
t.Errorf("expected createRedirectedBuild to pass: %v", err)
}
Expand Down
8 changes: 4 additions & 4 deletions test/cancel_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,7 @@ func TestTaskRunPipelineRunCancel(t *testing.T) {
tb.Step("foo", "ubuntu", tb.Command("/bin/bash"), tb.Args("-c", "sleep 500")),
))
if _, err := c.TaskClient.Create(task); err != nil {
t.Fatalf("Failed to create Task `%s`: %s", hwTaskName, err)
t.Fatalf("Failed to create Task `banana`: %s", err)
}

pipeline := tb.Pipeline("tomatoes", namespace,
Expand All @@ -64,7 +64,7 @@ func TestTaskRunPipelineRunCancel(t *testing.T) {
c := pr.Status.GetCondition(duckv1alpha1.ConditionSucceeded)
if c != nil {
if c.Status == corev1.ConditionTrue || c.Status == corev1.ConditionFalse {
return true, fmt.Errorf("pipelineRun %s already finished!", "pear")
return true, fmt.Errorf("pipelineRun %s already finished", "pear")
} else if c.Status == corev1.ConditionUnknown && (c.Reason == "Running" || c.Reason == "Pending") {
return true, nil
}
Expand Down Expand Up @@ -114,10 +114,10 @@ func TestTaskRunPipelineRunCancel(t *testing.T) {
}
return false, nil
}, "PipelineRunCancelled"); err != nil {
t.Errorf("Error waiting for TaskRun %s to finish: %s", hwTaskRunName, err)
t.Errorf("Error waiting for PipelineRun `pear` to finish: %s", err)
}

logger.Infof("Waiting for TaskRun %s in namespace %s to be cancelled", hwTaskRunName, namespace)
logger.Infof("Waiting for TaskRun `pear-foo` in namespace %s to be cancelled", namespace)
if err := WaitForTaskRunState(c, "pear-foo", func(tr *v1alpha1.TaskRun) (bool, error) {
c := tr.Status.GetCondition(duckv1alpha1.ConditionSucceeded)
if c != nil {
Expand Down
149 changes: 0 additions & 149 deletions test/crd.go

This file was deleted.

Loading

0 comments on commit f357829

Please sign in to comment.