Fix traceflow spec validation logic. #5223

shi0rik0 · 2023-07-08T13:30:19Z

Currently, when a Traceflow spec is invalid, the status of that Traceflow stays "Running" instead of becoming "Failed". This is because some validation logic is located at wrong places. I fixed this problem by moving them to the right place.

luolanzone · 2023-07-10T07:13:57Z

Hi @shi0rik0 please fix the unit test failure.

Currently, when some traceflow spec error happens, the status of such traceflows will be "Running", not "Failed". This is because some validation logic is located at wrong places. I fixed this problem by moving them to the right place. Signed-off-by: shi0rik0 <anguuan@outlook.com>

shi0rik0 · 2023-07-10T07:56:11Z

Hi @shi0rik0 please fix the unit test failure.

I think there is an issue with the unit test. The existing logic is as follows: the controller performs the first validation on the spec, and then the agent performs the second validation on the spec. It would be sufficient to have this validation inside the controller. Therefore, these unit tests are redundant. A workaround is to still keep the validation logic in the agent.

gran-vmv

LGTM. Need to run full CI pipelines before merge.

shi0rik0 · 2023-07-10T08:48:20Z

/test-all

luolanzone · 2023-07-10T08:54:06Z

/test-all

tnqn

Thanks for the fix. However, I would suggest validating the object before it is created.

tnqn · 2023-07-10T08:59:54Z

pkg/agent/controller/traceflow/traceflow_controller.go

+	if tf.Spec.Source.Pod == "" && tf.Spec.Destination.Pod == "" {
+		return fmt.Errorf("Traceflow %s has neither source nor destination Pod specified", tf.Name)
+	}
+
+	if tf.Spec.Source.Pod == "" && !tf.Spec.LiveTraffic {
+		return fmt.Errorf("Traceflow %s does not have source Pod specified", tf.Name)
+	}


In fact, these validations don't need to be performed at runtime. As with other CRDs, invalid objects can be prevented from being created with CRD schema validation or webhook validation, which is friendlier to users. They get immediate feedback when creating the resource and a clear message about what the error is, instead of having to poll the object to get the error.

I agree, but there are a couple of considerations:

The existing validation logic is performed at runtime, and if we want to switch to using CRD schema validation, it may require significant changes.

Expressing some complex validation logic using CRD schema can be challenging and may result in decreased YAML readability.

It would be better to prevent the creation of such invalid objects. Using CRD schema/webhook validation is not that complex, and I reckon Traceflow is probably the only CRD doing such validation at runtime instead of pre-creation, for historical reasons. But I'm fine with keeping the patch as is if you just want the PR to go this far.

Yes, implementing the current validation via CRD schema may not be practical, which is why I mentioned "webhook validation" too.

This is the first time I've heard of "webhook validation". It seems like a good idea to validate the Traceflow CRD with a webhook. If you wish, I'd like to refactor the validation logic into webhook validation.

Switching to webhook validation makes sense to me. You can start from https://github.com/antrea-io/antrea/blob/main/build/charts/antrea/templates/webhooks/validating/crdvalidator.yaml, and search the path "/validate/egress" in the code base to get some references.
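As a rough illustration of what pre-creation validation could look like, here is a minimal sketch of the two checks from the diff above, written as a pure function that a validating webhook handler might call. The types `endpoint` and `traceflowSpec` and the function `validateTraceflowSpec` are hypothetical stand-ins invented for this example; they are not Antrea's real API types or handler signature.

```go
package main

import "fmt"

// endpoint and traceflowSpec are simplified, hypothetical stand-ins for the
// Traceflow spec fields referenced in the diff; they are not Antrea's types.
type endpoint struct {
	Pod string
}

type traceflowSpec struct {
	Source      endpoint
	Destination endpoint
	LiveTraffic bool
}

// validateTraceflowSpec is a hypothetical helper that a validating webhook
// could run before the object is persisted. It reports whether the object is
// allowed and, if not, the message that would be returned to the user.
func validateTraceflowSpec(name string, spec traceflowSpec) (bool, string) {
	if spec.Source.Pod == "" && spec.Destination.Pod == "" {
		return false, fmt.Sprintf("Traceflow %s has neither source nor destination Pod specified", name)
	}
	if spec.Source.Pod == "" && !spec.LiveTraffic {
		return false, fmt.Sprintf("Traceflow %s does not have source Pod specified", name)
	}
	return true, ""
}

func main() {
	// A few illustrative specs, checked in a fixed order.
	cases := []struct {
		name string
		spec traceflowSpec
	}{
		{"tf-empty", traceflowSpec{}},
		{"tf-dst-only", traceflowSpec{Destination: endpoint{Pod: "pod-b"}}},
		{"tf-live-dst", traceflowSpec{Destination: endpoint{Pod: "pod-b"}, LiveTraffic: true}},
		{"tf-src-only", traceflowSpec{Source: endpoint{Pod: "pod-a"}}},
	}
	for _, c := range cases {
		allowed, msg := validateTraceflowSpec(c.name, c.spec)
		fmt.Printf("%s: allowed=%t %s\n", c.name, allowed, msg)
	}
}
```

With a check like this in the admission path, a rejected request fails at creation time with the message attached, so the user does not need to poll the object's status.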

tnqn · 2023-07-10T10:35:00Z

pkg/agent/controller/traceflow/traceflow_controller_test.go

@@ -625,7 +625,8 @@ func TestStartTraceflow(t *testing.T) {
 			tf: &crdv1alpha1.Traceflow{
 				ObjectMeta: metav1.ObjectMeta{Name: "tf3", UID: "uid3"},
 			},
-			expectedErrLog: "Traceflow tf3 has neither source nor destination Pod specified",


I think this parameter can be removed now, given that the only two cases using it have changed and there doesn't seem to be a reason why we should check the error log in the future.

tnqn · 2023-07-10T10:37:26Z

This is because some validation logic is located at wrong places. I fixed this problem by moving them to the right place.

The statement is inaccurate. It is not "moving" the code that fixes it, but making the checks return an error.
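To make the distinction concrete, here is a small sketch (not Antrea's actual code) of why returning the error matters. All names here (`startLogOnly`, `startReturnErr`, `syncStatus`, `phase`, `spec`) are hypothetical, and the assumption is that the caller decides the status based on the returned error.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// phase and spec are hypothetical, simplified types for this illustration.
type phase string

const (
	running phase = "Running"
	failed  phase = "Failed"
)

type spec struct {
	sourcePod      string
	destinationPod string
}

var errNoPods = errors.New("neither source nor destination Pod specified")

// startLogOnly detects the invalid spec but only logs it, so the caller
// never learns that anything went wrong.
func startLogOnly(s spec) error {
	if s.sourcePod == "" && s.destinationPod == "" {
		log.Printf("invalid spec: %v", errNoPods)
	}
	return nil
}

// startReturnErr performs the same check but returns the error to the caller.
func startReturnErr(s spec) error {
	if s.sourcePod == "" && s.destinationPod == "" {
		return errNoPods
	}
	return nil
}

// syncStatus is a hypothetical caller that marks the object Failed only
// when the start function reports an error.
func syncStatus(start func(spec) error, s spec) phase {
	if err := start(s); err != nil {
		return failed
	}
	return running
}

func main() {
	invalid := spec{}
	fmt.Println(syncStatus(startLogOnly, invalid))   // prints "Running"
	fmt.Println(syncStatus(startReturnErr, invalid)) // prints "Failed"
}
```

In this sketch the check sits in the same place in both variants; only the variant that returns the error lets the caller move the status to "Failed".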

shi0rik0 · 2023-07-11T11:54:24Z

I created another PR #5230.

luolanzone requested a review from gran-vmv July 10, 2023 06:38

luolanzone added the kind/bug Categorizes issue or PR as related to a bug. label Jul 10, 2023

shi0rik0 force-pushed the tf-validate-fix branch from 5692299 to 5bcf7cf Compare July 10, 2023 07:56

gran-vmv approved these changes Jul 10, 2023

tnqn reviewed Jul 10, 2023

shi0rik0 closed this Jul 11, 2023
