-
Notifications
You must be signed in to change notification settings - Fork 102
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add message with error to the plan status to ease debugging #1050
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! Aside from a few nits, we should figure out whether we propagate the message up and leave with somewhat repetitive plan status output.
Status ExecutionStatus `json:"status,omitempty"` | ||
} | ||
|
||
func (s *StepStatus) SetStatus(status ExecutionStatus) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
naming nit: I'm not a big fan of Set
ters and Get
ter in method names in Go. Or maybe it's just Set
and SetWithMessage
e.g. phaseStatus.Set(ERROR, "something bad happened")
?
pkg/engine/workflow/engine.go
Outdated
phaseStatus.Status = v1beta1.ExecutionFatalError | ||
stepStatus.Status = v1beta1.ExecutionFatalError | ||
planStatus.Status = v1beta1.ExecutionFatalError | ||
phaseStatus.SetStatusWithMessage(v1beta1.ExecutionFatalError, err.Error()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thinking out loud: it was ok to propagate an ExecutionFatalError
up, but do we want to do this for the message too? It seems somewhat verbose and redundant in the output. I'm slightly inclined to let the FATAL_ERROR message live at the level it occurred and only propagate the status up. Wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I was not sure here as well... probably keeping it at the lowest level and not propagating up could be a good enough solution? Let's hear from @mpereira since he requested this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would a plan's error message be an aggregation of its phases/steps error messages in case of multiple errored steps? Maybe it makes sense keeping error messages in the leaf steps? Assuming a possible future DAG plan engine, that would probably be the way to go, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep ... so if I understand it correctly we're all leaning toward propagating status but not propagating message. I'll adjust the PR...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good to me with some nits, I'm not going to hard request changes if they're not appropriate to this PR, so approved.
phaseStatus.Status = v1beta1.ExecutionFatalError | ||
stepStatus.Status = v1beta1.ExecutionFatalError | ||
planStatus.Status = v1beta1.ExecutionFatalError | ||
phaseStatus.Set(v1beta1.ExecutionFatalError) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be nice to move some of this out to a further helper function that sets all 3 at once, this is repetitive through the cases and is distracting me 😂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:D I see, sometimes we set three, sometimes we set two, sometimes one... and it's not a doubly linked list so you cannot traverse from step up to propagate it :/ do you feel like doing something like:
propagateStatus(plan, phase, step)
makes thing better? 🤔
@@ -175,7 +175,7 @@ func TestExecutePlan(t *testing.T) { | |||
expectedStatus: &v1beta1.PlanStatus{ | |||
Status: v1beta1.ExecutionInProgress, | |||
Name: "test", | |||
Phases: []v1beta1.PhaseStatus{{Name: "phase", Status: v1beta1.ExecutionInProgress, Steps: []v1beta1.StepStatus{{Status: v1beta1.ErrorStatus, Name: "step"}}}}, | |||
Phases: []v1beta1.PhaseStatus{{Name: "phase", Status: v1beta1.ExecutionInProgress, Steps: []v1beta1.StepStatus{{Status: v1beta1.ErrorStatus, Name: "step", Message: "A transient error when executing task test.phase.step.task. Will retry. dummy error"}}}}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not call it a dummy error - it sounds more like we threw it just to throw it, so ignore it, when it's a true error, just transient (as you mention). "Retrying until this step succeeds" or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's only in the test 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I am keeping it since it's just tests so I think it does not matter :) let me know if there's really a strong reason to change this
@@ -207,7 +207,7 @@ func TestExecutePlan(t *testing.T) { | |||
expectedStatus: &v1beta1.PlanStatus{ | |||
Status: v1beta1.ExecutionFatalError, | |||
Name: "test", | |||
Phases: []v1beta1.PhaseStatus{{Name: "phase", Status: v1beta1.ExecutionFatalError, Steps: []v1beta1.StepStatus{{Status: v1beta1.ExecutionFatalError, Name: "step"}}}}, | |||
Phases: []v1beta1.PhaseStatus{{Name: "phase", Status: v1beta1.ExecutionFatalError, Steps: []v1beta1.StepStatus{{Status: v1beta1.ExecutionFatalError, Message: "Error during execution: fatal error: default/test-instance failed in test.phase.step.task: dummy error", Name: "step"}}}}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
@@ -319,7 +319,7 @@ func TestExecutePlan(t *testing.T) { | |||
Status: v1beta1.ExecutionInProgress, | |||
Name: "test", | |||
Phases: []v1beta1.PhaseStatus{{Name: "phase", Status: v1beta1.ExecutionInProgress, Steps: []v1beta1.StepStatus{ | |||
{Name: "stepOne", Status: v1beta1.ErrorStatus}, | |||
{Name: "stepOne", Status: v1beta1.ErrorStatus, Message: "A transient error when executing task test.phase.stepOne.taskOne. Will retry. dummy error"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
@@ -367,7 +367,7 @@ func TestExecutePlan(t *testing.T) { | |||
Status: v1beta1.ExecutionInProgress, | |||
Name: "test", | |||
Phases: []v1beta1.PhaseStatus{{Name: "phase", Status: v1beta1.ExecutionInProgress, Steps: []v1beta1.StepStatus{ | |||
{Name: "stepOne", Status: v1beta1.ErrorStatus}, | |||
{Name: "stepOne", Status: v1beta1.ErrorStatus, Message: "A transient error when executing task test.phase.stepOne.taskOne. Will retry. dummy error"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
@@ -415,7 +415,7 @@ func TestExecutePlan(t *testing.T) { | |||
Status: v1beta1.ExecutionFatalError, | |||
Name: "test", | |||
Phases: []v1beta1.PhaseStatus{{Name: "phase", Status: v1beta1.ExecutionFatalError, Steps: []v1beta1.StepStatus{ | |||
{Name: "stepOne", Status: v1beta1.ExecutionFatalError}, | |||
{Name: "stepOne", Status: v1beta1.ExecutionFatalError, Message: "Error during execution: fatal error: default/test-instance failed in test.phase.stepOne.taskOne: dummy error"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
@@ -463,7 +463,7 @@ func TestExecutePlan(t *testing.T) { | |||
Status: v1beta1.ExecutionInProgress, | |||
Name: "test", | |||
Phases: []v1beta1.PhaseStatus{ | |||
{Name: "phaseOne", Status: v1beta1.ExecutionInProgress, Steps: []v1beta1.StepStatus{{Name: "step", Status: v1beta1.ErrorStatus}}}, | |||
{Name: "phaseOne", Status: v1beta1.ExecutionInProgress, Steps: []v1beta1.StepStatus{{Name: "step", Status: v1beta1.ErrorStatus, Message: "A transient error when executing task test.phaseOne.step.taskOne. Will retry. dummy error"}}}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you get the idea 😂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm! 🚢
I am going to merge this, I gave it one more try locally and it looks good. I am happy to follow up on the comments @gerred made, feel free to reach out |
Thanks @alenkacz! |
What this PR does / why we need it:
Output from local testing
Fixes #873