BugFix: Agent NetworkPolicyController handle ipBlocks in NP update. #1625

suwang48404 · 2020-12-07T19:35:10Z

No description provided.

antrea-bot · 2020-12-07T19:35:25Z

Thanks for your PR.
Unit tests and code linters are run automatically every time the PR is updated.
E2e, conformance and network policy tests can only be triggered by a member of the vmware-tanzu organization. Regular contributors to the project should join the org.

The following commands are available:

/test-e2e: to trigger e2e tests.
/skip-e2e: to skip e2e tests.
/test-conformance: to trigger conformance tests.
/skip-conformance: to skip conformance tests.
/test-all-features-conformance: to trigger conformance tests with all alpha features enabled.
/skip-all-features-conformance: to skip conformance tests with all alpha features enabled.
/test-whole-conformance: to trigger all conformance tests on linux.
/skip-whole-conformance: to skip all conformance tests on linux.
/test-networkpolicy: to trigger networkpolicy tests.
/skip-networkpolicy: to skip networkpolicy tests.
/test-windows-conformance: to trigger windows conformance tests.
/skip-windows-conformance: to skip windows conformance tests.
/test-windows-networkpolicy: to trigger windows networkpolicy tests.
/skip-windows-networkpolicy: to skip windows networkpolicy tests.
/test-hw-offload: to trigger ovs hardware offload test.
/skip-hw-offload: to skip ovs hardware offload test.
/test-all: to trigger all tests (except whole conformance).
/skip-all: to skip all tests (except whole conformance).

codecov-io · 2020-12-07T20:13:28Z

Codecov Report

Merging #1625 (9be2744) into master (9d3d10b) will decrease coverage by 1.61%.
The diff coverage is 53.40%.

@@            Coverage Diff             @@
##           master    #1625      +/-   ##
==========================================
- Coverage   63.31%   61.70%   -1.62%     
==========================================
  Files         170      181      +11     
  Lines       14250    15430    +1180     
==========================================
+ Hits         9023     9521     +498     
- Misses       4292     4875     +583     
- Partials      935     1034      +99

Flag	Coverage Δ
kind-e2e-tests	`51.84% <50.17%> (-3.55%)`	⬇️
unit-tests	`40.49% <16.90%> (-0.78%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
cmd/antrea-agent/agent.go	`0.00% <0.00%> (ø)`
cmd/antrea-agent/options.go	`26.89% <0.00%> (+6.17%)`	⬆️
.../agent/apiserver/handlers/networkpolicy/handler.go	`58.33% <ø> (ø)`
...gent/controller/noderoute/node_route_controller.go	`46.06% <ø> (-0.41%)`	⬇️
.../agent/flowexporter/connections/conntrack_linux.go	`29.48% <0.00%> (+0.32%)`	⬆️
pkg/agent/stats/collector.go	`97.72% <ø> (ø)`
pkg/antctl/antctl.go	`100.00% <ø> (ø)`
pkg/antctl/transform/addressgroup/transform.go	`0.00% <0.00%> (ø)`
pkg/antctl/transform/controllerinfo/transform.go	`0.00% <ø> (ø)`
pkg/antctl/transform/version/transform.go	`0.00% <ø> (ø)`
... and 79 more

antoninbas · 2020-12-07T20:20:00Z

pkg/agent/controller/networkpolicy/reconciler.go

+				if len(newRule.To.IPBlocks) > 0 {
+					to := ipBlocksToOFAddresses(newRule.To.IPBlocks, r.ipv4Enabled, r.ipv6Enabled)
+					ofRule.To = append(ofRule.To, to...)
+				}


@suwang48404 if this is a bug fix, can we add a test case for this?

/cc @Dyanngg

@Dyanngg I saw the problem when working on a separate product. I think in my setup, the initial add rule failed due to underlying system not ready, then sequent action becomes update, and this problem was observed.

But even just by eye balling the code, it seems update missed ipBlock in egress direction, no?

@antoninbas not sure what this bug may be triggered externally. It is not normal work flow.

This might be by design. If the ipBlock CIDR is not changed before/after the rule update (rule update can happen when the rule itself did not change, be the pods it selects is added/deleted etc.), we do not need to process the the ipBlock in update function. On the other hand, if a ipBlock CIDR is added/removed/modified in a rule, it becomes a NEW RULE so the update logic won't be involved here. Hence, I tend to believe the current logic is solid.

u r correct, the IPBLock has not been changed before/after the rule change because there is no rule change. The reason updateRule is called is because initial AddRule failed due to some underlying system not ready, and retry logic calls updateRule to (re)install the rule.

And when rule is not created initially and update is used for retry, and egress rule has ip blocks, we will have a problem, no?

also I put in the fix because I experience the problem, unless u guys tell me that there is change made in v0.11 such that updateRule is not being called if initially addRule fails, If that is the case, perhaps if !exist{} block in updateRule should have been removed, perhaps?

@suwang48404 thanks for catching this. However it's not supposed to append the IPBlock to every "member group". The reason why we call "groupMembersByServices" is to make Pods that have same named port definition fall into same group so that they can share openflows. As IPBlock cannot resolve any named port, they should share the openflows with the pods that cannot resolve any named port. See the logic in https://github.com/vmware-tanzu/antrea/blob/370e2f4e011da38b0559adca702243646b5bbc48/pkg/agent/controller/networkpolicy/reconciler.go#L458-L486.

hi @tnqn just to clarify, the bug is valid. But fix is not correct because conjunction flow need to share same matching criteria. We will need to come up with a better fix, right?

in that case, perhaps I should close this PR and file an issue instead? Is this something hard to fix? and timeline wise, how soon can this be done?

hi @tnqn just to clarify, the bug is valid. But fix is not correct because conjunction flow need to share same matching criteria. We will need to come up with a better fix, right?

Yes, the bug is valid but the fix will cause another issue.

in that case, perhaps I should close this PR and file an issue instead? Is this something hard to fix? and timeline wise, how soon can this be done?

I'm fine if you want to improve the fix in this PR or file the issue and assign it to me. I think it's not hard to fix but we do need to improve unit test to cover this scenario. I think we can get the fix in 0.12 which is supposed to next week, does the timeline work for you?

@tnqn 0.12 next week sounds super good. Thx for helping out !!

build/yamls/base/crds.yml

Dyanngg

Hi @suwang48404 Could you update the PR description on how to reproduce this bug?
I tested with the ToT build using the following spec:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: egress-np
spec:
  podSelector:
    matchLabels:
      app: web
  policyTypes:
  - Egress
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: client
    - ipBlock:
        cidr: "10.10.1.17/32"
    ports:
    - protocol: TCP
      port: https

where client Pod has an IP address of 10.10.1.16 and port https maps to 443.
I applied this spec, deleted the client Pod to force a rule update event, and the result of rule update looks correct to me.

Before client Pod deletion:

kubectl exec -it antrea-agent-zrvnl -n kube-system -c antrea-ovs -- ovs-ofctl dump-flows br-int table=50
 cookie=0x4000000000000, duration=1152.720s, table=50, n_packets=4530, n_bytes=1086677, priority=210,ct_state=-new+est,ip actions=resubmit(,61)
 cookie=0x4050000000000, duration=4.992s, table=50, n_packets=0, n_bytes=0, priority=200,ip,nw_src=10.10.1.13 actions=conjunction(1,1/3),conjunction(2,1/3)
 cookie=0x4050000000000, duration=4.992s, table=50, n_packets=0, n_bytes=0, priority=200,ip,nw_src=10.10.1.14 actions=conjunction(1,1/3),conjunction(2,1/3)
 cookie=0x4050000000000, duration=4.992s, table=50, n_packets=0, n_bytes=0, priority=200,ip,nw_dst=10.10.1.16 actions=conjunction(2,2/3)
 cookie=0x4050000000000, duration=4.983s, table=50, n_packets=0, n_bytes=0, priority=200,ip,nw_dst=10.10.1.17 actions=conjunction(1,2/3)
 cookie=0x4050000000000, duration=4.992s, table=50, n_packets=0, n_bytes=0, priority=200,tcp,tp_dst=443 actions=conjunction(2,3/3)
 cookie=0x4050000000000, duration=4.993s, table=50, n_packets=0, n_bytes=0, priority=190,conj_id=2,ip actions=load:0x2->NXM_NX_REG5[],ct(commit,table=61,zone=65520,exec(load:0x2->NXM_NX_CT_LABEL[32..63]))
 cookie=0x4050000000000, duration=4.985s, table=50, n_packets=0, n_bytes=0, priority=190,conj_id=1,ip actions=load:0x1->NXM_NX_REG5[],ct(commit,table=61,zone=65520,exec(load:0x1->NXM_NX_CT_LABEL[32..63]))
 cookie=0x4000000000000, duration=1152.731s, table=50, n_packets=231, n_bytes=17094, priority=0 actions=resubmit(,60)

Something I found alarming is that it seems the policy has been interpreted as 'allow egress traffic to client Pods on port https, and allow egress traffic to 10.10.1.17/32 on any port', which I'm not sure if is correct/by design. cc @abhiraut @tnqn @wenyingd
That being said, after an event which triggers rule update (client Pod deleted), the flows get updated correctly:

kubectl exec -it antrea-agent-zrvnl -n kube-system -c antrea-ovs -- ovs-ofctl dump-flows br-int table=50
 cookie=0x4000000000000, duration=1516.208s, table=50, n_packets=5963, n_bytes=1430937, priority=210,ct_state=-new+est,ip actions=resubmit(,61)
 cookie=0x4050000000000, duration=368.480s, table=50, n_packets=0, n_bytes=0, priority=200,ip,nw_src=10.10.1.14 actions=conjunction(1,1/3)
 cookie=0x4050000000000, duration=368.480s, table=50, n_packets=0, n_bytes=0, priority=200,ip,nw_src=10.10.1.13 actions=conjunction(1,1/3)
 cookie=0x4050000000000, duration=368.471s, table=50, n_packets=0, n_bytes=0, priority=200,ip,nw_dst=10.10.1.17 actions=conjunction(1,2/3)
 cookie=0x4050000000000, duration=368.473s, table=50, n_packets=0, n_bytes=0, priority=190,conj_id=1,ip actions=load:0x1->NXM_NX_REG5[],ct(commit,table=61,zone=65520,exec(load:0x1->NXM_NX_CT_LABEL[32..63]))
 cookie=0x4000000000000, duration=1516.219s, table=50, n_packets=303, n_bytes=22422, priority=0 actions=resubmit(,60)

and the isolation rules on table 60 also checked out. I also experimented with egress rule having regular port as opposed to named port, and rule updates also seem to work as expected.

build/yamls/base/crds.yml

suwang48404 · 2020-12-07T23:18:55Z

Hi @suwang48404 Could you update the PR description on how to reproduce this bug?
I tested with the ToT build using the following spec:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: egress-np
spec:
  podSelector:
    matchLabels:
      app: web
  policyTypes:
  - Egress
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: client
    - ipBlock:
        cidr: "10.10.1.17/32"
    ports:
    - protocol: TCP
      port: https

where client Pod has an IP address of 10.10.1.16 and port https maps to 443.
I applied this spec, deleted the client Pod to force a rule update event, and the result of rule update looks correct to me.

Before client Pod deletion:

kubectl exec -it antrea-agent-zrvnl -n kube-system -c antrea-ovs -- ovs-ofctl dump-flows br-int table=50
 cookie=0x4000000000000, duration=1152.720s, table=50, n_packets=4530, n_bytes=1086677, priority=210,ct_state=-new+est,ip actions=resubmit(,61)
 cookie=0x4050000000000, duration=4.992s, table=50, n_packets=0, n_bytes=0, priority=200,ip,nw_src=10.10.1.13 actions=conjunction(1,1/3),conjunction(2,1/3)
 cookie=0x4050000000000, duration=4.992s, table=50, n_packets=0, n_bytes=0, priority=200,ip,nw_src=10.10.1.14 actions=conjunction(1,1/3),conjunction(2,1/3)
 cookie=0x4050000000000, duration=4.992s, table=50, n_packets=0, n_bytes=0, priority=200,ip,nw_dst=10.10.1.16 actions=conjunction(2,2/3)
 cookie=0x4050000000000, duration=4.983s, table=50, n_packets=0, n_bytes=0, priority=200,ip,nw_dst=10.10.1.17 actions=conjunction(1,2/3)
 cookie=0x4050000000000, duration=4.992s, table=50, n_packets=0, n_bytes=0, priority=200,tcp,tp_dst=443 actions=conjunction(2,3/3)
 cookie=0x4050000000000, duration=4.993s, table=50, n_packets=0, n_bytes=0, priority=190,conj_id=2,ip actions=load:0x2->NXM_NX_REG5[],ct(commit,table=61,zone=65520,exec(load:0x2->NXM_NX_CT_LABEL[32..63]))
 cookie=0x4050000000000, duration=4.985s, table=50, n_packets=0, n_bytes=0, priority=190,conj_id=1,ip actions=load:0x1->NXM_NX_REG5[],ct(commit,table=61,zone=65520,exec(load:0x1->NXM_NX_CT_LABEL[32..63]))
 cookie=0x4000000000000, duration=1152.731s, table=50, n_packets=231, n_bytes=17094, priority=0 actions=resubmit(,60)

Something I found alarming is that it seems the policy has been interpreted as 'allow egress traffic to client Pods on port https, and allow egress traffic to 10.10.1.17/32 on any port', which I'm not sure if is correct/by design. cc @abhiraut @tnqn @wenyingd
That being said, after an event which triggers rule update (client Pod deleted), the flows get updated correctly:

kubectl exec -it antrea-agent-zrvnl -n kube-system -c antrea-ovs -- ovs-ofctl dump-flows br-int table=50
 cookie=0x4000000000000, duration=1516.208s, table=50, n_packets=5963, n_bytes=1430937, priority=210,ct_state=-new+est,ip actions=resubmit(,61)
 cookie=0x4050000000000, duration=368.480s, table=50, n_packets=0, n_bytes=0, priority=200,ip,nw_src=10.10.1.14 actions=conjunction(1,1/3)
 cookie=0x4050000000000, duration=368.480s, table=50, n_packets=0, n_bytes=0, priority=200,ip,nw_src=10.10.1.13 actions=conjunction(1,1/3)
 cookie=0x4050000000000, duration=368.471s, table=50, n_packets=0, n_bytes=0, priority=200,ip,nw_dst=10.10.1.17 actions=conjunction(1,2/3)
 cookie=0x4050000000000, duration=368.473s, table=50, n_packets=0, n_bytes=0, priority=190,conj_id=1,ip actions=load:0x1->NXM_NX_REG5[],ct(commit,table=61,zone=65520,exec(load:0x1->NXM_NX_CT_LABEL[32..63]))
 cookie=0x4000000000000, duration=1516.219s, table=50, n_packets=303, n_bytes=22422, priority=0 actions=resubmit(,60)

and the isolation rules on table 60 also checked out. I also experimented with egress rule having regular port as opposed to named port, and rule updates also seem to work as expected.

@wenyingd @Dyanngg In my setup, the initial rule creation has failed, the egress update path take if !exists {}, is ur test cover that path too? It seems odd if it works, because as u see ipBlock is not added, how could it work?

Dyanngg · 2020-12-07T23:57:22Z

@wenyingd @Dyanngg In my setup, the initial rule creation has failed, the egress update path take if !exists {}, is ur test cover that path too? It seems odd if it works, because as u see ipBlock is not added, how could it work?

See #1625 (comment). I think the initial rule creation has failed is something we should look into instead.

suwang48404 · 2020-12-08T00:28:16Z

@wenyingd @Dyanngg In my setup, the initial rule creation has failed, the egress update path take if !exists {}, is ur test cover that path too? It seems odd if it works, because as u see ipBlock is not added, how could it work?

See #1625 (comment). I think the initial rule creation has failed is something we should look into instead.

I repurposed networkPolicyController package on another product, it is not on antrea per se. In our use case, underlying ovs up and down are normal scope of operations.

But regardless where the use case may be, the NetworkPolicyController logic today seems able to handle rule creation failures, I think this fix just make it more solid, no? Or maybe u have different perspective why this fix should not be in?

Dyanngg · 2020-12-08T01:50:31Z

@wenyingd @Dyanngg In my setup, the initial rule creation has failed, the egress update path take if !exists {}, is ur test cover that path too? It seems odd if it works, because as u see ipBlock is not added, how could it work?

See #1625 (comment). I think the initial rule creation has failed is something we should look into instead.

I repurpose networkPolicyController package on another product, it is not on antrea per se. But regardless where the use case may be, it seems NetworkPolicyController logic today seems able to handle rule creation failures, I think the fix just make it more solid, no? Or maybe u have different perspective why this fix should not be in?

I hear your point on this makes it more solid. On the other hand, given the premise that rule add is handled correctly (if an exception occurred in Antrea, reconcile should have been retried), processing ipBlock in update function might be redundant and lead to performance cost. I'll let @tnqn and @wenyingd comment here since I did not implement this function in the first hand.

suwang48404 · 2020-12-08T02:01:06Z

@wenyingd @Dyanngg In my setup, the initial rule creation has failed, the egress update path take if !exists {}, is ur test cover that path too? It seems odd if it works, because as u see ipBlock is not added, how could it work?

See #1625 (comment). I think the initial rule creation has failed is something we should look into instead.

I repurpose networkPolicyController package on another product, it is not on antrea per se. But regardless where the use case may be, it seems NetworkPolicyController logic today seems able to handle rule creation failures, I think the fix just make it more solid, no? Or maybe u have different perspective why this fix should not be in?

I hear your point on this makes it more solid. On the other hand, given the premise that rule add is handled correctly (if an exception occurred in Antrea, reconcile should have been retried), processing ipBlock in update function might be redundant and lead to performance cost. I'll let @tnqn and @wenyingd comment here since I did not implement this function in the first hand.

To summarize, if addRule fails for any reason (maybe ovs is not ready, maybe something else), the retry logic will call updateRule, and if the rule is egress and consists of ipBlocks, the dst ipBlock will be ignored, That is the bug this PR is trying to address and it is applicable to antrea agent in CNI deployment or other use cases.

suwang48404 · 2020-12-08T17:22:17Z

@jianjuns @abhiraut @tnqn can u help review this.

To summarize, if addRule fails for any reason (maybe ovs is not ready, maybe something else), the retry logic will call updateRule, and if the rule is egress and consists of ipBlocks, the dst ipBlock will be ignored, That is the bug this PR is trying to address and it is applicable to antrea agent in CNI deployment or other use cases.

Dyanngg · 2020-12-08T18:26:25Z

@jianjuns @abhiraut @tnqn can u help review this.

To summarize, if addRule fails for any reason (maybe ovs is not ready, maybe something else), the retry logic will call updateRule, and if the rule is egress and consists of ipBlocks, the dst ipBlock will be ignored, That is the bug this PR is trying to address and it is applicable to antrea agent in CNI deployment or other use cases.

I don't think the retry logic will call update for the rule. If you look at L402

// Record ofID only if its Openflow is installed successfully.
lastRealized.ofIDs[svcKey] = ofRule.FlowID

When installation fails, lastRealized should not contain ruleID, and next time rule is synced, !exist will evaluate to true.

suwang48404 · 2020-12-08T19:50:57Z

@jianjuns @abhiraut @tnqn can u help review this.
To summarize, if addRule fails for any reason (maybe ovs is not ready, maybe something else), the retry logic will call updateRule, and if the rule is egress and consists of ipBlocks, the dst ipBlock will be ignored, That is the bug this PR is trying to address and it is applicable to antrea agent in CNI deployment or other use cases.

I don't think the retry logic will call update for the rule. If you look at L402
// Record ofID only if its Openflow is installed successfully.
lastRealized.ofIDs[svcKey] = ofRule.FlowID
When installation fails, lastRealized should not contain ruleID, and next time rule is synced, !exist will evaluate to true.

I may not have been clear regarding to ofFlow rule and policy rule.
Let us walk through the steps:

an new egress policy rule with dst IP block was added via add()(L391)
the policy rule is added to cache via computeOFRulesForAdd(L413)
in case that ovs is not ready, add() returns error in L400 without adding ofFlow rule to cache.
in next reconcile cycle, update() (L256) is called because policy rule has been added in step 2 above, and therefore exists.
a new ofLow rule is created in update() (L604) because *ofFlow rule" is not installed therefore not in cache, and dst IP Block will be ignored. Hence the bug.

Hope this helps, and let me know if I missed anything.

Thx, Su

Dyanngg · 2020-12-08T21:40:01Z

I may not have been clear regarding to ofFlow rule and policy rule.
Let us walk through the steps:

an new egress policy rule with dst IP block was added via add()(L391)

the policy rule is added to cache via computeOFRulesForAdd(L413)

in case that ovs is not ready, add() returns error in L400 without adding ofFlow rule to cache.

in next reconcile cycle, update() (L256) is called because policy rule has been added in step 2 above, and therefore exists.

a new ofLow rule is created in update() (L604) because *ofFlow rule" is not installed therefore not in cache, and dst IP Block will be ignored. Hence the bug.

Hope this helps, and let me know if I missed anything.

Thx, Su

Thanks Su for walking through this. So yes this is an issue which is hard to write a testcase for. I'm okay with this change except that I believe to fix it more properly, we need to address the TODO in L411.

// TODO: Handle the case that the following processing fails or partially succeeds.
r.lastRealizeds.Store(rule.ID, lastRealized)

That involves removing rule.ID from lastRealized if flow installation failed at any point. @tnqn

suwang48404 · 2020-12-08T21:55:40Z

I may not have been clear regarding to ofFlow rule and policy rule.
Let us walk through the steps:

an new egress policy rule with dst IP block was added via add()(L391)

the policy rule is added to cache via computeOFRulesForAdd(L413)

in case that ovs is not ready, add() returns error in L400 without adding ofFlow rule to cache.

in next reconcile cycle, update() (L256) is called because policy rule has been added in step 2 above, and therefore exists.

a new ofLow rule is created in update() (L604) because *ofFlow rule" is not installed therefore not in cache, and dst IP Block will be ignored. Hence the bug.

Hope this helps, and let me know if I missed anything.
Thx, Su

Thanks Su for walking through this. So yes this is an issue which is hard to write a testcase for. I'm okay with this change except that I believe to fix it more properly, we need to address the TODO in L411.
// TODO: Handle the case that the following processing fails or partially succeeds.
r.lastRealizeds.Store(rule.ID, lastRealized)  
That involves removing rule.ID from lastRealized if flow installation failed at any point. @tnqn

I will let u guys decide what to do with 411. I simply fix a bug that for as long as a rule flow are created in update, the rule flow should include ipBlock, it seems just a obvious omission, seems irrespective of L411.

suwang48404 · 2020-12-08T21:59:02Z

I may not have been clear regarding to ofFlow rule and policy rule.
Let us walk through the steps:

an new egress policy rule with dst IP block was added via add()(L391)

the policy rule is added to cache via computeOFRulesForAdd(L413)

in case that ovs is not ready, add() returns error in L400 without adding ofFlow rule to cache.

in next reconcile cycle, update() (L256) is called because policy rule has been added in step 2 above, and therefore exists.

a new ofLow rule is created in update() (L604) because *ofFlow rule" is not installed therefore not in cache, and dst IP Block will be ignored. Hence the bug.

Hope this helps, and let me know if I missed anything.
Thx, Su

Thanks Su for walking through this. So yes this is an issue which is hard to write a testcase for. I'm okay with this change except that I believe to fix it more properly, we need to address the TODO in L411.
// TODO: Handle the case that the following processing fails or partially succeeds.
r.lastRealizeds.Store(rule.ID, lastRealized)  
That involves removing rule.ID from lastRealized if flow installation failed at any point. @tnqn
I will let u guys decide what to do with 411. I simply fix a bug that for as long as a rule flow are created in update, the rule flow should include ipBlock, it seems just a obvious omission, seems irrespective of L411.

BTW, I appreciate a detail review process and Q &A, which help produce good quality code. But let us try not to send people to some fishing expedition, that is just going to waste a lot time that we don't have.

antoninbas · 2020-12-08T23:53:26Z

BTW, I appreciate a detail review process and Q &A, which help produce good quality code. But let us try not to send people to some fishing expedition, that is just going to waste a lot time that we don't have.

Apologies, that's not our intention here, but quality and robustness is our priority, especially when it comes to a feature like NetworkPolicy. And it's important that the folks who maintain this code such as @Dyanngg have a good understanding of the changes.

Thanks for sharing the detailed sequence. Isn't is pretty straightforward to test this case, by returning an error in ofClient.InstallPolicyRuleFlows, given that this interface is mocked? It seems that a transient failure when adding OVS flows is something we should indeed accommodate for.

Dyanngg

I'm good with merging this and I will update the logic for lastRealized as well as adding testcase for this particular scenario

suwang48404 · 2020-12-09T01:09:06Z

I'm good with merging this and I will update the logic for lastRealized as well as adding testcase for this particular scenario

@Dyanngg @antoninbas thx for helping out on testing and further improvements.

suwang48404 · 2020-12-09T02:21:06Z

I'm good with merging this and I will update the logic for lastRealized as well as adding testcase for this particular scenario

@Dyanngg @antoninbas thx for helping out on testing and further improvements.

@Dyanngg is ur proposed change addresses issue fixed by this PR? I am assuming ur change would remove the need for update() to create new ofFlow rules? If yes and u plan to work on it promptly, maybe we don't need this PR merged, and instead waiting for ur change along with companion test(s)? Let me know.

tnqn

Thanks @suwang48404 for raising this bug, I left a comment at https://github.com/vmware-tanzu/antrea/pull/1625/files#r538961037, I guess it would cause duplicate conjunction clauses, mark it as "request changes" to hold the merge for now.

suwang48404 · 2020-12-09T04:04:20Z

@antoninbas @Dyanngg @tnqn

as the fix presented in PR is incomplete.

@tnqn has gracefully agree to help with the issue #1635.

closing this PR, and thank all for helping out.

vmwclabot added the cla-not-required label Dec 7, 2020

suwang48404 requested review from tnqn, Dyanngg and abhiraut December 7, 2020 19:35

antoninbas reviewed Dec 7, 2020

View reviewed changes

jianjuns reviewed Dec 7, 2020

View reviewed changes

build/yamls/base/crds.yml Outdated Show resolved Hide resolved

Dyanngg reviewed Dec 7, 2020

View reviewed changes

build/yamls/base/crds.yml Outdated Show resolved Hide resolved

BugFix: Agent NetworkPolicyController handle ipBlocks in NP update.

9be2744

suwang48404 force-pushed the applyee-merge branch from da969e2 to 9be2744 Compare December 8, 2020 17:20

Dyanngg approved these changes Dec 9, 2020

View reviewed changes

antoninbas approved these changes Dec 9, 2020

View reviewed changes

tnqn requested changes Dec 9, 2020

View reviewed changes

suwang48404 closed this Dec 9, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BugFix: Agent NetworkPolicyController handle ipBlocks in NP update. #1625

BugFix: Agent NetworkPolicyController handle ipBlocks in NP update. #1625

suwang48404 commented Dec 7, 2020

antrea-bot commented Dec 7, 2020

codecov-io commented Dec 7, 2020 •

edited

Loading

antoninbas Dec 7, 2020

suwang48404 Dec 7, 2020

suwang48404 Dec 7, 2020

Dyanngg Dec 7, 2020

suwang48404 Dec 8, 2020 •

edited

Loading

suwang48404 Dec 8, 2020 •

edited

Loading

tnqn Dec 9, 2020

suwang48404 Dec 9, 2020 •

edited

Loading

tnqn Dec 9, 2020

suwang48404 Dec 9, 2020

Dyanngg left a comment

suwang48404 commented Dec 7, 2020

Dyanngg commented Dec 7, 2020

suwang48404 commented Dec 8, 2020 •

edited

Loading

Dyanngg commented Dec 8, 2020

suwang48404 commented Dec 8, 2020 •

edited

Loading

suwang48404 commented Dec 8, 2020

Dyanngg commented Dec 8, 2020

suwang48404 commented Dec 8, 2020

Dyanngg commented Dec 8, 2020

suwang48404 commented Dec 8, 2020

suwang48404 commented Dec 8, 2020

antoninbas commented Dec 8, 2020

Dyanngg left a comment

suwang48404 commented Dec 9, 2020

suwang48404 commented Dec 9, 2020

tnqn left a comment

suwang48404 commented Dec 9, 2020

BugFix: Agent NetworkPolicyController handle ipBlocks in NP update. #1625

BugFix: Agent NetworkPolicyController handle ipBlocks in NP update. #1625

Conversation

suwang48404 commented Dec 7, 2020

antrea-bot commented Dec 7, 2020

codecov-io commented Dec 7, 2020 • edited Loading

Codecov Report

antoninbas Dec 7, 2020

Choose a reason for hiding this comment

suwang48404 Dec 7, 2020

Choose a reason for hiding this comment

suwang48404 Dec 7, 2020

Choose a reason for hiding this comment

Dyanngg Dec 7, 2020

Choose a reason for hiding this comment

suwang48404 Dec 8, 2020 • edited Loading

Choose a reason for hiding this comment

suwang48404 Dec 8, 2020 • edited Loading

Choose a reason for hiding this comment

tnqn Dec 9, 2020

Choose a reason for hiding this comment

suwang48404 Dec 9, 2020 • edited Loading

Choose a reason for hiding this comment

tnqn Dec 9, 2020

Choose a reason for hiding this comment

suwang48404 Dec 9, 2020

Choose a reason for hiding this comment

Dyanngg left a comment

Choose a reason for hiding this comment

suwang48404 commented Dec 7, 2020

Dyanngg commented Dec 7, 2020

suwang48404 commented Dec 8, 2020 • edited Loading

Dyanngg commented Dec 8, 2020

suwang48404 commented Dec 8, 2020 • edited Loading

suwang48404 commented Dec 8, 2020

Dyanngg commented Dec 8, 2020

suwang48404 commented Dec 8, 2020

Dyanngg commented Dec 8, 2020

suwang48404 commented Dec 8, 2020

suwang48404 commented Dec 8, 2020

antoninbas commented Dec 8, 2020

Dyanngg left a comment

Choose a reason for hiding this comment

suwang48404 commented Dec 9, 2020

suwang48404 commented Dec 9, 2020

tnqn left a comment

Choose a reason for hiding this comment

suwang48404 commented Dec 9, 2020

codecov-io commented Dec 7, 2020 •

edited

Loading

suwang48404 Dec 8, 2020 •

edited

Loading

suwang48404 Dec 8, 2020 •

edited

Loading

suwang48404 Dec 9, 2020 •

edited

Loading

suwang48404 commented Dec 8, 2020 •

edited

Loading

suwang48404 commented Dec 8, 2020 •

edited

Loading