Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix traceflow spec validation logic. #5223

Closed
wants to merge 1 commit into from

Conversation

shi0rik0
Copy link
Contributor

@shi0rik0 shi0rik0 commented Jul 8, 2023

Currently, when some traceflow spec error happens, the status of such traceflows will be "Running", not "Failed". This is because some validation logic is located at wrong places. I fixed this problem by moving them to the right place.

@luolanzone luolanzone requested a review from gran-vmv July 10, 2023 06:38
@luolanzone luolanzone added the kind/bug Categorizes issue or PR as related to a bug. label Jul 10, 2023
@luolanzone
Copy link
Contributor

Hi @shi0rik0 please fix the unit test failure.

Currently, when some traceflow spec error happens, the status of such
traceflows will be "Running", not "Failed". This is because some validation
logic is located at wrong places. I fixed this problem by moving them to the
right place.

Signed-off-by: shi0rik0 <anguuan@outlook.com>
@shi0rik0
Copy link
Contributor Author

Hi @shi0rik0 please fix the unit test failure.

I think there is an issue with the unit test. The existing logic is as follows: the controller performs the first validation on the spec, and then the agent performs the second validation on the spec. It would be sufficient to have this validation inside the controller. Therefore, these unit tests are redundant. A workaround is to still keep the validation logic in the agent.

Copy link
Contributor

@gran-vmv gran-vmv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Need to run full CI pipelines before merge.

@shi0rik0
Copy link
Contributor Author

/test-all

1 similar comment
@luolanzone
Copy link
Contributor

/test-all

Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix. However, I would suggest to validate it before creating it.

Comment on lines +393 to +399
if tf.Spec.Source.Pod == "" && tf.Spec.Destination.Pod == "" {
return fmt.Errorf("Traceflow %s has neither source nor destination Pod specified", tf.Name)
}

if tf.Spec.Source.Pod == "" && !tf.Spec.LiveTraffic {
return fmt.Errorf("Traceflow %s does not have source Pod specified", tf.Name)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact these validations don't need to perform at runtime. Like other CRDs, they can be prevented from being created with CRD schema validation or webhook validation, which is more friendly to users. They can get immediate feedback when creating the resource and clear message what's the error, instead of having to polling the object to get the error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but there are a couple of considerations:

  1. The existing validation logic is performed at runtime, and if we want to switch to using CRD schema validation, it may require significant changes.
  2. Expressing some complex validation logic using CRD schema can be challenging and may result in decreased YAML readability.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's more ideal to prevent the creation of such invalid objects. Using CRD schema/wehbook validation is not that complex, and I reckon Traceflow is probably the only CRD doing such validation at runtime instead of pre-creation, due to historical reasons. But I'm fine to keep the patch as is if you just want the PR go this far.

Yes, implementing the current validation via CRD schema may be not practicable, that's why I mentioned "webhook validation" too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the first time I heard of "webhook validation". It seems like it's a good idea to validate Traceflow CRD using webhook validation. If you wish, I'd like to refactor the validating logic to webhook validation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switching to webhook validation makes sense to me. You can start from https://github.com/antrea-io/antrea/blob/main/build/charts/antrea/templates/webhooks/validating/crdvalidator.yaml, and search the path "/validate/egress" in the code base to get some references.

Comment on lines +393 to +399
if tf.Spec.Source.Pod == "" && tf.Spec.Destination.Pod == "" {
return fmt.Errorf("Traceflow %s has neither source nor destination Pod specified", tf.Name)
}

if tf.Spec.Source.Pod == "" && !tf.Spec.LiveTraffic {
return fmt.Errorf("Traceflow %s does not have source Pod specified", tf.Name)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's more ideal to prevent the creation of such invalid objects. Using CRD schema/wehbook validation is not that complex, and I reckon Traceflow is probably the only CRD doing such validation at runtime instead of pre-creation, due to historical reasons. But I'm fine to keep the patch as is if you just want the PR go this far.

Yes, implementing the current validation via CRD schema may be not practicable, that's why I mentioned "webhook validation" too.

@@ -625,7 +625,8 @@ func TestStartTraceflow(t *testing.T) {
tf: &crdv1alpha1.Traceflow{
ObjectMeta: metav1.ObjectMeta{Name: "tf3", UID: "uid3"},
},
expectedErrLog: "Traceflow tf3 has neither source nor destination Pod specified",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this parameter can be removed now given the only two cases using it are changed, and there doesn't seem a reason why we should check error log in the future.

@tnqn
Copy link
Member

tnqn commented Jul 10, 2023

This is because some validation logic is located at wrong places. I fixed this problem by moving them to the right place.

The statement is inaccurate. Not "moving" the code fix it, but making them return error does.

@shi0rik0 shi0rik0 closed this Jul 11, 2023
@shi0rik0
Copy link
Contributor Author

I created another PR #5230.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants