
[BUILD-373] SHIP-0021: Local Source Upload #86

Merged
merged 4 commits into shipwright-io:main from the ship-0021-upload branch
Jan 24, 2022

Conversation

@otaviof (Member) commented Dec 19, 2021

Changes

Local Source Upload

This pull request introduces the ability to stream the contents of a local directory into the pod generated for a Shipwright BuildRun. This means users can employ Shipwright to build container images out of their local repository clone.

The following steps represent the most common use-case scenario:

  1. Create a Build, if it does not exist already, for instance:

     shp build create nodejs-ex \
         --source-url="https://github.com/otaviof/nodejs-ex.git" \
         --output-image="image-registry.openshift-image-registry.svc:5000/otaviof/nodejs-ex:latest"

  2. Clone the repository and make local changes:

     git clone https://github.com/otaviof/nodejs-ex.git && \
         cd nodejs-ex

  3. Use the new feature to build a container image for you:

     shp build upload nodejs-ex \
         --follow \
         --output-image="image-registry.openshift-image-registry.svc:5000/otaviof/nodejs-ex:demo"

  4. Rinse and repeat the previous step!

Log Follower and PodWatcher

The shp build upload command can also follow the build pod logs using the --follow flag. To let the new feature share the existing components, the following changes take place:

  • PodWatcher and Follower are now managed by Params, instantiated as a factory just like the other components employed throughout the project
  • PodWatcher is modified to use a slice of "on event" functions, so other callers can use the same (shared) instance (see the sketch after this list)
  • NewFollower is modified to have a different signature; it requires the BuildRun qualified name (namespace and name)
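
Below is a minimal sketch of the shared-watcher pattern described above; the package, type, and method names here are illustrative assumptions rather than the exact CLI API:

package podwatcher // hypothetical package name for this sketch

import (
    corev1 "k8s.io/api/core/v1"
)

// PodWatcher fans out pod events to every registered handler, so the log
// Follower and the upload command can share one watcher instance.
type PodWatcher struct {
    onPodEventFns []func(pod *corev1.Pod) error
}

// WithOnPodEventFn appends another "on event" function (name assumed for
// illustration); callers register handlers instead of owning the watcher.
func (p *PodWatcher) WithOnPodEventFn(fn func(pod *corev1.Pod) error) *PodWatcher {
    p.onPodEventFns = append(p.onPodEventFns, fn)
    return p
}

// handleEvent invokes all registered handlers for an incoming pod event,
// stopping at the first error.
func (p *PodWatcher) handleEvent(pod *corev1.Pod) error {
    for _, fn := range p.onPodEventFns {
        if err := fn(pod); err != nil {
            return err
        }
    }
    return nil
}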

The changes here overlap with #89.

Testing

The changes in this pull request are covered by unit tests, and the pull request also introduces an end-to-end test scenario for the new local source upload feature: we simulate a complete build by streaming the local changes onto the build pod and then assert that a successful build takes place.

The new feature needs the changes added in controller PR #934, so the end-to-end testing now uses the nightly build. This should remain only until the next release.

Submitter Checklist

  • Includes tests if functionality changed/was added
  • Includes docs if changes are user-facing
  • Set a kind label on this PR
  • Release notes block has been filled in, or marked NONE

Release Notes

Introducing "shp build upload" sub-command to stream a local repository clone data onto a newly created BuildRun, so users can employ the Shipwright Build Controller to ship container images out of arbitrary content.

@openshift-ci openshift-ci bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note labels Dec 19, 2021
@otaviof (Member, Author) commented Dec 19, 2021

/cc @alicerum

@openshift-ci openshift-ci bot requested a review from alicerum December 19, 2021 09:19
@otaviof otaviof added the kind/feature Categorizes issue or PR as related to a new feature. label Dec 20, 2021
@otaviof otaviof changed the title [WIP] SHIP-0021: Local Source Upload [WIP][BUILD-373] SHIP-0021: Local Source Upload Dec 21, 2021
@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 4, 2022
@gabemontero (Member) left a comment


Took a very preliminary pass, @otaviof, and only have a high-level comment for now. I'll preface by noting that this may already be part of your thought process, even if it is not reflected in this WIP PR yet.

And I'm guessing what I'm about to say is not much of a surprise :-)

Feels like another round of iteration on this makes sense, to get even further code reuse, in addition to the sharing of PodWatcher and Tail between the new Upload Command here and the Follower command over in https://github.com/shipwright-io/cli/blob/b42be6944302d384e4fb0b24cd06f3bbcca64a4f/pkg/shp/cmd/follower/follow.go

In particular, if we add a level of pluggability so that when the Pod hits the running state we can configure / plug in / inject the content-streaming element at

target := &streamer.Target{
    Namespace: pod.GetNamespace(),
    Pod:       pod.GetName(),
    Container: containerName,
    BaseDir:   targetBaseDir,
}
if err := u.performDataStreaming(target); err != nil {
    return err
}
into the event handler at
case corev1.PodRunning:
    if !f.enteredRunningState {
        f.Log(fmt.Sprintf("Pod %q in %q state, starting up log tail", pod.GetName(), corev1.PodRunning))
        f.enteredRunningState = true
        // graceful time to wait for container start
        time.Sleep(3 * time.Second)
        // start tailing container logs
        f.tailLogs(pod)
    }

we can maximize code sharing wrt the Pod event handling, and the upload code could benefit from all the recent timing fixes and error edge cases from reported bugs that are currently captured in

func (f *Follower) OnEvent(pod *corev1.Pod) error {
    switch pod.Status.Phase {
    case corev1.PodRunning:
        if !f.enteredRunningState {
            f.Log(fmt.Sprintf("Pod %q in %q state, starting up log tail", pod.GetName(), corev1.PodRunning))
            f.enteredRunningState = true
            // graceful time to wait for container start
            time.Sleep(3 * time.Second)
            // start tailing container logs
            f.tailLogs(pod)
        }
    case corev1.PodFailed:
        msg := ""
        var br *buildv1alpha1.BuildRun
        err := wait.PollImmediate(1*time.Second, 15*time.Second, func() (done bool, err error) {
            br, err = f.shpClientset.ShipwrightV1alpha1().BuildRuns(pod.Namespace).Get(f.ctx, f.buildRunName, metav1.GetOptions{})
            if err != nil {
                if kerrors.IsNotFound(err) {
                    return true, nil
                }
                f.Log(fmt.Sprintf("error getting buildrun %q for pod %q: %s\n", f.buildRunName, pod.GetName(), err.Error()))
                return false, nil
            }
            if br.IsDone() {
                return true, nil
            }
            return false, nil
        })
        if err != nil {
            f.Log(fmt.Sprintf("gave up trying to get a buildrun %q in a terminal state for pod %q, proceeding with pod failure processing", f.buildRunName, pod.GetName()))
        }
        switch {
        case br == nil:
            msg = fmt.Sprintf("BuildRun %q has been deleted.\n", br.Name)
        case err == nil && br.IsCanceled():
            msg = fmt.Sprintf("BuildRun %q has been canceled.\n", br.Name)
        case (err == nil && br.DeletionTimestamp != nil) || (err != nil && kerrors.IsNotFound(err)):
            msg = fmt.Sprintf("BuildRun %q has been deleted.\n", br.Name)
        case pod.DeletionTimestamp != nil:
            msg = fmt.Sprintf("Pod %q has been deleted.\n", pod.GetName())
        default:
            msg = fmt.Sprintf("Pod %q has failed!\n", pod.GetName())
            podBytes, err2 := json.MarshalIndent(pod, "", " ")
            if err2 == nil {
                msg = fmt.Sprintf("Pod %q has failed!\nPod JSON:\n%s\n", pod.GetName(), string(podBytes))
            }
            err = fmt.Errorf("build pod %q has failed", pod.GetName())
        }
        // see if because of deletion or cancelation
        f.Log(msg)
        f.stop()
        return err
    case corev1.PodSucceeded:
        // encountered scenarios where the build run quickly enough that the pod effectively skips the running state,
        // or the events come in reverse order, and we never enter the tail
        if !f.enteredRunningState {
            f.Log(fmt.Sprintf("succeeded event for pod %q arrived before or in place of running event so dumping logs now", pod.GetName()))
            var b strings.Builder
            for _, c := range pod.Spec.Containers {
                logs, err := util.GetPodLogs(f.ctx, f.kclientset, *pod, c.Name)
                if err != nil {
                    f.Log(fmt.Sprintf("could not get logs for container %q: %s", c.Name, err.Error()))
                    continue
                }
                fmt.Fprintf(&b, "*** Pod %q, container %q: ***\n\n", pod.Name, c.Name)
                fmt.Fprintln(&b, logs)
            }
            f.Log(b.String())
        }
        f.Log(fmt.Sprintf("Pod %q has succeeded!\n", pod.GetName()))
        f.stop()
    default:
        f.Log(fmt.Sprintf("Pod %q is in state %q...\n", pod.GetName(), string(pod.Status.Phase)))
        // handle any issues with pulling images that may fail
        for _, c := range pod.Status.Conditions {
            if c.Type == corev1.PodInitialized || c.Type == corev1.ContainersReady {
                if c.Status == corev1.ConditionUnknown {
                    return fmt.Errorf(c.Message)
                }
            }
        }
    }
    return nil
}

// OnTimeout reacts to either the context or request timeout causing the pod watcher to exit
func (f *Follower) OnTimeout(msg string) {
    f.Log(fmt.Sprintf("BuildRun %q log following has stopped because: %q\n", f.buildRunName, msg))
}

// OnNoPodEventsYet reacts to the pod watcher telling us it has not received any pod events for our build run
func (f *Follower) OnNoPodEventsYet() {
    f.Log(fmt.Sprintf("BuildRun %q log following has not observed any pod events yet.\n", f.buildRunName))
    br, err := f.shpClientset.ShipwrightV1alpha1().BuildRuns(f.namespace).Get(f.ctx, f.buildRunName, metav1.GetOptions{})
    if err != nil {
        f.Log(fmt.Sprintf("error accessing BuildRun %q: %s", f.buildRunName, err.Error()))
        return
    }
    c := br.Status.GetCondition(buildv1alpha1.Succeeded)
    giveUp := false
    msg := ""
    switch {
    case c != nil && c.Status == corev1.ConditionTrue:
        giveUp = true
        msg = fmt.Sprintf("BuildRun '%s' has been marked as successful.\n", br.Name)
    case c != nil && c.Status == corev1.ConditionFalse:
        giveUp = true
        msg = fmt.Sprintf("BuildRun '%s' has been marked as failed.\n", br.Name)
    case br.IsCanceled():
        giveUp = true
        msg = fmt.Sprintf("BuildRun '%s' has been canceled.\n", br.Name)
    case br.DeletionTimestamp != nil:
        giveUp = true
        msg = fmt.Sprintf("BuildRun '%s' has been deleted.\n", br.Name)
    case !br.HasStarted():
        f.Log(fmt.Sprintf("BuildRun '%s' has been marked as failed.\n", br.Name))
    }
    if giveUp {
        f.Log(msg)
        f.Log(fmt.Sprintf("exiting 'shp build run --follow' for BuildRun %q", br.Name))
        f.stop()
    }
}

Feels like maybe a new parameter could be supplied to the Follower that, if set, results in a method callback whose signature matches the current u.performDataStreaming(target).

The existing build start and buildrun logs follow flows would not set this new method callback, but Upload would.
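
As a rough sketch of that pluggable callback, with hypothetical package, field, and method names (not the actual Follower API):

package follower // hypothetical placement for this sketch

import (
    corev1 "k8s.io/api/core/v1"
)

// OnPodRunningFn is an optional hook invoked once the build pod reaches the
// running phase; the upload command would set it to stream the local
// directory content, while plain log following would leave it nil.
type OnPodRunningFn func(pod *corev1.Pod) error

// Follower is trimmed down to the single field relevant for this sketch.
type Follower struct {
    onPodRunning OnPodRunningFn // hypothetical optional callback
}

// WithOnPodRunning registers the callback and returns the Follower for chaining.
func (f *Follower) WithOnPodRunning(fn OnPodRunningFn) *Follower {
    f.onPodRunning = fn
    return f
}

// invokeOnPodRunning would be called from OnEvent's corev1.PodRunning case,
// right where the log tail starts, so Upload can plug its streaming in.
func (f *Follower) invokeOnPodRunning(pod *corev1.Pod) error {
    if f.onPodRunning == nil {
        return nil
    }
    return f.onPodRunning(pod)
}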

WDYT?

@otaviof (Member, Author) commented Jan 5, 2022

Feels like another round of iteration on this makes sense, to get even further code reuse, in addition to the sharing of PodWatcher and Tail between the new Upload Command here and the Follower command over in https://github.com/shipwright-io/cli/blob/b42be6944302d384e4fb0b24cd06f3bbcca64a4f/pkg/shp/cmd/follower/follow.go

[...]

WDYT?

Thanks for the input, Gabe! It does make sense in terms of how to consolidate those two features; I'll look at it in more detail soon.

@otaviof otaviof force-pushed the ship-0021-upload branch 2 times, most recently from bf422cf to d309598 Compare January 14, 2022 07:05
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 14, 2022
@otaviof (Member, Author) commented Jan 18, 2022

@gabemontero, going back to the subject of reusing the Follower, please consider my latest commit here. You will find the following:

  • PodWatcher and Follower managed by Params, instantiated as a factory by direct usage of the Params attributes (a sketch of this factory pattern follows this list)
  • PodWatcher uses a slice of functions, so components can use a shared instance of the PodWatcher
  • Follower has a different signature and requires the BuildRun name (namespace and name) during construction; during shp build run the Follower is now only instantiated in Run (instead of Complete, like before)
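
A minimal sketch of the factory behavior described in the first bullet, assuming a trimmed-down Params and hypothetical method names:

package params // hypothetical placement for this sketch

// PodWatcher is a stub standing in for the real component; only the surface
// needed to illustrate sharing is shown.
type PodWatcher struct{}

// Params acts as a factory: the first caller triggers instantiation, and every
// later caller receives the same shared PodWatcher instance.
type Params struct {
    podWatcher *PodWatcher
}

// NewPodWatcher lazily builds the shared watcher from the Params attributes
// and hands the same instance to every sub-command that asks for it.
func (p *Params) NewPodWatcher() *PodWatcher {
    if p.podWatcher == nil {
        p.podWatcher = &PodWatcher{}
    }
    return p.podWatcher
}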

The last item here has a direct impact on the #89 discussion, since what we discussed there became a logical decision in the latest commit.

I wonder if we should create a spin-off PR with the last commit. What do you think, @gabemontero and @SaschaSchwarze0? (Thanks in advance!)

@otaviof otaviof mentioned this pull request Jan 18, 2022
@SaschaSchwarze0 (Member) commented:

@gabemontero, going back to the subject of reusing the Follower, please consider my latest commit here. You will find the following:

[...]

I am fine with taking any reasonable solution for the problem addressed in #89. I have not been as engaged here as elsewhere when it comes to the structure of our CLI, and therefore do not have a strong opinion.

@gabemontero (Member) left a comment


Just a nit on one of the comments you updated, @otaviof.

Code changes look good per our discussion earlier today and the summary from that discussion I posted in #89 (comment).

I like how these different components are converging.

@otaviof (Member, Author) commented Jan 19, 2022

/hold The changes in this pull request depend on build #934; after the merge, we need to update go.mod and the vendored modules.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 19, 2022
@otaviof (Member, Author) commented Jan 19, 2022

/retitle [BUILD-373] SHIP-0021: Local Source Upload

@openshift-ci openshift-ci bot changed the title [WIP][BUILD-373] SHIP-0021: Local Source Upload [BUILD-373] SHIP-0021: Local Source Upload Jan 19, 2022
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 19, 2022
@gabemontero (Member) left a comment


I'm good with the changes, @otaviof, but since I see some test failures, both e2e and unit, that could be related to your changes, I'll refrain from lgtm'ing until you get those clean.

Quick hint on the e2e failure: it looks like a hang. If that persists, I would not be surprised if it is in one of the log-following e2e runs. I've noticed that the BATS tooling is not good at dealing with hangs. In #81 (comment) I tried to document the temporary changes I had to make to debug, where I temporarily bypassed BATS for the e2e tests and ran the shp commands directly to get data.

@otaviof otaviof force-pushed the ship-0021-upload branch 2 times, most recently from 1106519 to 7907d6d Compare January 21, 2022 10:07
Users can upload local content via the newly added `shp build upload`.
@otaviof otaviof force-pushed the ship-0021-upload branch 4 times, most recently from 508692f to 6a4607b Compare January 21, 2022 12:02
Instantiating PodWatcher and Follower as factories in Params. Also,
adapting sub-commands using those components to rely on Params directly.
Using `nightly-2022-01-21-1642741753` release in order to simulate the
local source upload.
@otaviof (Member, Author) commented Jan 21, 2022

I'm good with the changes, @otaviof, but since I see some test failures, both e2e and unit, that could be related to your changes, I'll refrain from lgtm'ing until you get those clean.

[...]

Thanks, @gabemontero! I've fixed the end-to-end tests according to the refactoring and the new local source upload. Please take another look.

Also, since the controller changes are now merged and incorporated into this pull request, I'm removing the "hold" label.

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 21, 2022
@gabemontero (Member) commented:

/approve

@openshift-ci bot (Contributor) commented Jan 21, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gabemontero

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 21, 2022
@gabemontero (Member) commented:

Update the release note section in the PR's description to something other than TODO (i.e., announce the new function) and I'll tag for merge, @otaviof.

@otaviof (Member, Author) commented Jan 24, 2022

Update the release note section in the PR's description to something other than TODO (i.e., announce the new function) and I'll tag for merge, @otaviof.

Thanks for the heads-up; I almost forgot about it. Please consider the updated description; it should reflect all the changes present in this PR.

@gabemontero (Member) commented:

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jan 24, 2022
@openshift-merge-robot openshift-merge-robot merged commit 6c37a52 into shipwright-io:main Jan 24, 2022
@otaviof otaviof deleted the ship-0021-upload branch January 24, 2022 13:47
@SaschaSchwarze0 SaschaSchwarze0 added this to the release-v0.8.0 milestone Jan 26, 2022
@otaviof otaviof mentioned this pull request Jan 28, 2022