Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BugFix: Agent NetworkPolicyController handle ipBlocks in NP update. #1625

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions pkg/agent/controller/networkpolicy/reconciler.go
Original file line number Diff line number Diff line change
Expand Up @@ -297,6 +297,7 @@ func (r *reconciler) getOFPriority(rule *CompletedRule, table binding.TableIDTyp
RulePriority: rule.Priority,
}
ofPriority, registered := pa.assigner.GetOFPriority(p)

if !registered {
allPrioritiesInPolicy := make([]types.Priority, rule.MaxPriority+1)
for i := int32(0); i <= rule.MaxPriority; i++ {
Expand Down Expand Up @@ -612,6 +613,11 @@ func (r *reconciler) update(lastRealized *lastRealized, newRule *CompletedRule,
PolicyRef: newRule.SourceRef,
EnableLogging: newRule.EnableLogging,
}
if len(newRule.To.IPBlocks) > 0 {
to := ipBlocksToOFAddresses(newRule.To.IPBlocks, r.ipv4Enabled, r.ipv6Enabled)
ofRule.To = append(ofRule.To, to...)
}
Comment on lines +616 to +619
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@suwang48404 if this is a bug fix, can we add a test case for this?

/cc @Dyanngg

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Dyanngg I saw the problem when working on a separate product. I think in my setup, the initial add rule failed due to underlying system not ready, then sequent action becomes update, and this problem was observed.

But even just by eye balling the code, it seems update missed ipBlock in egress direction, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@antoninbas not sure what this bug may be triggered externally. It is not normal work flow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be by design. If the ipBlock CIDR is not changed before/after the rule update (rule update can happen when the rule itself did not change, be the pods it selects is added/deleted etc.), we do not need to process the the ipBlock in update function. On the other hand, if a ipBlock CIDR is added/removed/modified in a rule, it becomes a NEW RULE so the update logic won't be involved here. Hence, I tend to believe the current logic is solid.

Copy link
Contributor Author

@suwang48404 suwang48404 Dec 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

u r correct, the IPBLock has not been changed before/after the rule change because there is no rule change. The reason updateRule is called is because initial AddRule failed due to some underlying system not ready, and retry logic calls updateRule to (re)install the rule.

And when rule is not created initially and update is used for retry, and egress rule has ip blocks, we will have a problem, no?

Copy link
Contributor Author

@suwang48404 suwang48404 Dec 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also I put in the fix because I experience the problem, unless u guys tell me that there is change made in v0.11 such that updateRule is not being called if initially addRule fails, If that is the case, perhaps if !exist{} block in updateRule should have been removed, perhaps?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@suwang48404 thanks for catching this. However it's not supposed to append the IPBlock to every "member group". The reason why we call "groupMembersByServices" is to make Pods that have same named port definition fall into same group so that they can share openflows. As IPBlock cannot resolve any named port, they should share the openflows with the pods that cannot resolve any named port. See the logic in https://github.com/vmware-tanzu/antrea/blob/370e2f4e011da38b0559adca702243646b5bbc48/pkg/agent/controller/networkpolicy/reconciler.go#L458-L486.

Copy link
Contributor Author

@suwang48404 suwang48404 Dec 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @tnqn just to clarify, the bug is valid. But fix is not correct because conjunction flow need to share same matching criteria. We will need to come up with a better fix, right?

in that case, perhaps I should close this PR and file an issue instead? Is this something hard to fix? and timeline wise, how soon can this be done?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @tnqn just to clarify, the bug is valid. But fix is not correct because conjunction flow need to share same matching criteria. We will need to come up with a better fix, right?

Yes, the bug is valid but the fix will cause another issue.

in that case, perhaps I should close this PR and file an issue instead? Is this something hard to fix? and timeline wise, how soon can this be done?

I'm fine if you want to improve the fix in this PR or file the issue and assign it to me. I think it's not hard to fix but we do need to improve unit test to cover this scenario. I think we can get the fix in 0.12 which is supposed to next week, does the timeline work for you?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tnqn 0.12 next week sounds super good. Thx for helping out !!


err := r.idAllocator.allocateForRule(ofRule)
if err != nil {
return fmt.Errorf("error allocating Openflow ID")
Expand Down