-
Notifications
You must be signed in to change notification settings - Fork 386
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix race conditions in NetworkPolicyController #4028
Conversation
Codecov Report
@@ Coverage Diff @@
## main #4028 +/- ##
==========================================
+ Coverage 63.65% 65.52% +1.87%
==========================================
Files 300 304 +4
Lines 46567 46604 +37
==========================================
+ Hits 29640 30538 +898
+ Misses 14596 13658 -938
- Partials 2331 2408 +77
|
6855b40
to
30ff4d7
Compare
/test-all |
/test-all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not super familiar with this code at this stage, but I took a stab at a review
@@ -24,19 +24,22 @@ import ( | |||
antreatypes "antrea.io/antrea/pkg/controller/types" | |||
) | |||
|
|||
func getAntreaNetworkPolicyReference(anp *crdv1alpha1.NetworkPolicy) controlplane.NetworkPolicyReference { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious about why we are not returning a pointer here (*controlplane.NetworkPolicyReference
)?
Ultimately we use it as key for internalNetworkPolicyQueue
. I don't know if this causes unnecessary copies, or if on the contrary less indirection is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
internalNetworkPolicyQueue
must use value instead of pointer as the key, otherwise the same NetworkPolicies will not be treated as same item because the pointers may be different. But other places could use pointers to save some copy overhead. I have updated them.
@@ -132,7 +88,7 @@ func (n *NetworkPolicyController) processAntreaNetworkPolicy(np *crdv1alpha1.Net | |||
// Create AppliedToGroup for each AppliedTo present in AntreaNetworkPolicy spec. | |||
for _, at := range np.Spec.AppliedTo { | |||
appliedToGroupNamesSet.Insert(n.createAppliedToGroup( | |||
np.Namespace, at.PodSelector, at.NamespaceSelector, at.ExternalEntitySelector)) | |||
networkPolicyRef, np.Namespace, at.PodSelector, at.NamespaceSelector, at.ExternalEntitySelector)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe we could remove the second parameter of createAppliedToGroup
if the namespace name can be easily retrieved from networkPolicyRef
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
made some changes to how to track references. Now only policy UID is required for reference, so namespace is still passed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed to ensure atomicity when updating internal NetworkPolicy and creating or deleting AddressGroups and AppliedToGroups, no extra argument is needed.
_, exists, _ := c.appliedToGroupStore.Get(cg) | ||
if exists { | ||
c.enqueueAppliedToGroup(cg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible for the AppliedToGroup to be deleted between the call to get and the call to enqueue? I assume it is possible and that it's not an issue, but a comment to that effect would be nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
key, _ := npc.internalNetworkPolicyQueue.Get() | ||
npc.internalNetworkPolicyQueue.Done(key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe for the clarity of the test, we should check that the keys match {getAntreaClusterNetworkPolicyReference(cnp1), getAntreaClusterNetworkPolicyReference(cnp2)}
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
// cnp2 is added after the ClusterGroup. | ||
npc.addCNP(cnp2) | ||
|
||
// cnp1 is added before the ClusterGroup. The rule's From should be empty as the ClusterGroup hasn't been synced, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I find the "added" terminology a bit confusing here, since we don't call the add handler anymore. Maybe just "synced"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
_, npc := newController() | ||
cnp := getCNP() | ||
newCNP := cnp.DeepCopy() | ||
newCNP.Spec.Priority = float64(100) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you add a comment explaining why we set the priority to this value here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
_, npc := newController() | ||
anp := getANP() | ||
newANP := anp.DeepCopy() | ||
newANP.Spec.Priority = float64(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you add a comment explaining why we set the priority to this value here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
@@ -428,7 +428,7 @@ func TestToAntreaPeerForCRD(t *testing.T) { | |||
_, npc := newController() | |||
npc.addClusterGroup(&cgA) | |||
npc.cgStore.Add(&cgA) | |||
actualPeer := npc.toAntreaPeerForCRD(tt.inPeers, testCNPObj, tt.direction, tt.namedPortExists) | |||
actualPeer := npc.toAntreaPeerForCRD(getAntreaClusterNetworkPolicyReference(testCNPObj), tt.inPeers, testCNPObj, tt.direction, tt.namedPortExists) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just thought about this: could we use getACNPReference
/ getANPReference
as function names? It gets pretty long to read and we already use the ACNP / ANP abbreviations in other function names
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
// addressGroupReferences tracks the reference count of policies for AddressGroups. | ||
addressGroupReferences map[string]networkPolicyReferences | ||
// appliedToGroupMutex prevents race conditions between multiple internalNetworkPolicyWorkers when they create or | ||
// delete same AppliedToGroups and ensures atomicity of updating appliedToGroupStore and appliedToGroupReferences. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
delete the same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
// delete same AppliedToGroups and ensures atomicity of updating appliedToGroupStore and appliedToGroupReferences. | ||
appliedToGroupMutex sync.Mutex | ||
// addressGroupMutex prevents race conditions between multiple internalNetworkPolicyWorkers when they create or | ||
// delete same AddressGroups and ensures atomicity of updating addressGroupStore and addressGroupReferences. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
delete the same
there is also an extra whitespace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
/skip-integration tested manually |
Regarding PR #4047, has updated documentation in the |
/test-all |
for _, p := range parentGroupObjs { | ||
parentGrp := p.(*antreatypes.Group) | ||
c.enqueueInternalGroup(parentGrp.SourceReference.ToGroupName()) | ||
} | ||
} | ||
|
||
// triggerDerivedGroupUpdates triggers processing of AppliedToGroup and AddressGroup derived from the provided group. | ||
func (c *NetworkPolicyController) triggerDerivedGroupUpdates(grp string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This replaces the previous group enqueue operations in reprocessCNP
, which blindly enqueues all AppliedToGroups and AddressGroups.
/test-all |
80326a0
to
4844c6e
Compare
/test-all |
/test-all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM< I think @GraysonWu should take another look as well
// It must use value instead of pointer as the key, otherwise the same NetworkPolicies will not be treated as same | ||
// item because the pointers may be different. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for adding this comment
// in case of ADD event or modified and store the updated instance, in case | ||
// of an UPDATE event. | ||
func (n *NetworkPolicyController) processAntreaNetworkPolicy(np *crdv1alpha1.NetworkPolicy) *antreatypes.NetworkPolicy { | ||
// instance to the caller wherein. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove "wherein", which is no longer valid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Just curious: is there a way to "test" a race condition?
Good question, I added an unit test
Although the code before refactoring didn't create and delete groups in |
/skip-all the latest change only added an unit test and updated a comment. |
@antoninbas @GraysonWu could you give another approval? |
There were a few race conditions in NetworkPolicyController: * An AppliedToGroup or AddressGroup in use may be removed if situations like this happens: 1. addANP creates a group for ANP A; 2. addNetworkPolicy reuses the group for KNP B, is going to create an internal NetworkPolicy; 3. deleteANP deletes the group for ANP A because at that moment no other internal NetworkPolicies are using the group; 4. addNetworkPolicy commits the internal NetworkPolicy for KNP B to storage, but the group no longer exists. * An Antrea-native NetworkPolicy may be out-of-date if situations like this happens: 1. An ACNP event is received, `updateCNP` calculates the new internal NetworkPolicy for the ACNP, is going to commit it to storage; 2. A ClusterGroup event triggers update of the ACNP via triggerCNPUpdates 3. triggerCNPUpdates calls reprocessCNP which updates the new internal NetworkPolicy for the ACNP and commits it to storage; 4. updateCNP in the first step commits its internal NetworkPolicy to storage which overrides the update of the ClusterGroup event. The second one caused test flake of the test case "TestGroupNoK8sNP/Case=ACNPNestedClusterGroup". To resolve the race conditions completely and make NetworkPolicy handling less error prone, this patch refactors NetworkPolicyController: * Event handlers no longer update the storage of internal NetworkPolicy directly and only triggers resync of affected policies, which ensures that there is at most one worker handling an internal NetworkPolicy at any moment. * Ensure atomicity when updating internal NetworkPolicy and creating or deleting AddressGroups and AppliedToGroups. Duplicate code and tests are deleted with the refactoring. Signed-off-by: Quan Tian <qtian@vmware.com>
return "", nil | ||
// Internal Group is not found, which means the corresponding namespaced group is either not created yet or not | ||
// processed yet. Once the internal Group is created and processed, the sync worker for internal group will | ||
// re-enqueue the ClusterNetworkPolicy processing which will trigger the creation of AppliedToGroup. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// re-enqueue the ClusterNetworkPolicy processing which will trigger the creation of AppliedToGroup. | |
// re-enqueue the AntreaNetworkPolicy processing which will trigger the creation of AppliedToGroup. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for pointing it out, will fix it in other PR
/skip-all |
@tnqn Sorry for the late review. I added several questions about locking in the comments, and have a general question: why earlier we chose to process related NPs in event handlers directly? Any side effects (performance, etc.) with the new approach of enqueuing NPs? |
@jianjuns I don't find your questions about locking in the comments. Have you published them?
It's because we only support K8s NetworkPolicy when designing the workflow, it worked fine and was more efficient to calculate spec in event handlers and calculate span in workers separately. But with Antrea-native policies, groups features added, the workflow lost its advantage and became complex and error prone: When we only support K8s NetworkPolicy:
The above two facts changed after NetworkPolicyController started to support Antrea-native policies and groups:
Mutex need to be used very carefully to avoid race conditions, for example, even just internal NetworkPolicy, they may be written by the following goroutines:
But even mutex is used properly, there will be a bottleneck for all the above executors, only one of them could execute the code block that may create/update/delete InternalNetworkPolicy, AddressGroup, AppliedToGroup. The previous code's critical section was not big enough to avoid race conditions, hence the errors. To fix it in old way, more code block need to be executed with lock acquired. And the code was quite redundant, this can be told by the lines of code change: the refactoring added 1,504 lines of code and deleted 2,735 lines of code, and the unit test coverage of this package was still improved from 58.2% to 63.3%. I just created issue #4127 to describe the race conditions and why a refactoring is necessary for record.
In theory, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Quan for the detailed explanation. All make sense to me.
Yes seems I forgot to publish the previous comments.
n.internalNetworkPolicyMutex.Lock() | ||
defer n.internalNetworkPolicyMutex.Unlock() | ||
|
||
if !oldInternalPolicyExists { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question - does NP store update need to be protected by the mutex too? Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, they need to be protected by the same mutex because NP store has index of AddressGroups and AppliedToGroups, which is used to determine whether an AddressGroup/AppliedToGroup is orphan or not. For example, the following code means the AppliedToGroup is no longer used and can be deleted:
objs, _ := n.internalNetworkPolicyStore.GetByIndex(store.AppliedToGroupIndex, atgName)
if len(objs) == 0 {
We must ensure the AddressGroup/AppliedToGroup data (stored in their own store) and their index (stored in NP store) are updated atomically, otherwise the first race condition the PR description describes could happen:
- addANP creates a group for ANP A;
- addNetworkPolicy reuses the group for KNP B, is going to create an internal NetworkPolicy;
- deleteANP deletes the group for ANP A because at that moment no other internal NetworkPolicies are using the group;
- addNetworkPolicy commits the internal NetworkPolicy for KNP B to storage, but the group no longer exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Got it.
case controlplane.AntreaClusterNetworkPolicy: | ||
cnp, err := n.cnpLister.Get(key.Name) | ||
if err != nil { | ||
n.deleteInternalNetworkPolicy(internalNetworkPolicyName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deleteInternalNetworkPolicy() can delete groups too, but I did not see that is protected by the mutex?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's protected by the mutex:
antrea/pkg/controller/networkpolicy/networkpolicy_controller.go
Lines 1540 to 1542 in 9c9b098
func (n *NetworkPolicyController) deleteInternalNetworkPolicy(name string) { | |
n.internalNetworkPolicyMutex.Lock() | |
defer n.internalNetworkPolicyMutex.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Humm.. not sure why I missed these lines.
There were a few race conditions in NetworkPolicyController: * An AppliedToGroup or AddressGroup in use may be removed if situations like this happens: 1. addANP creates a group for ANP A; 2. addNetworkPolicy reuses the group for KNP B, is going to create an internal NetworkPolicy; 3. deleteANP deletes the group for ANP A because at that moment no other internal NetworkPolicies are using the group; 4. addNetworkPolicy commits the internal NetworkPolicy for KNP B to storage, but the group no longer exists. * An Antrea-native NetworkPolicy may be out-of-date if situations like this happens: 1. An ACNP event is received, `updateCNP` calculates the new internal NetworkPolicy for the ACNP, is going to commit it to storage; 2. A ClusterGroup event triggers update of the ACNP via triggerCNPUpdates 3. triggerCNPUpdates calls reprocessCNP which updates the new internal NetworkPolicy for the ACNP and commits it to storage; 4. updateCNP in the first step commits its internal NetworkPolicy to storage which overrides the update of the ClusterGroup event. The second one caused test flake of the test case "TestGroupNoK8sNP/Case=ACNPNestedClusterGroup". To resolve the race conditions completely and make NetworkPolicy handling less error prone, this patch refactors NetworkPolicyController: * Event handlers no longer update the storage of internal NetworkPolicy directly and only triggers resync of affected policies, which ensures that there is at most one worker handling an internal NetworkPolicy at any moment. * Ensure atomicity when updating internal NetworkPolicy and creating or deleting AddressGroups and AppliedToGroups. Duplicate code and tests are deleted with the refactoring. Signed-off-by: Quan Tian <qtian@vmware.com>
There were a few race conditions in NetworkPolicyController:
like this happens:
internal NetworkPolicy;
internal NetworkPolicies are using the group;
storage, but the group no longer exists.
this happens:
updateCNP
calculates the new internalNetworkPolicy for the ACNP, is going to commit it to storage;
triggerCNPUpdates
NetworkPolicy for the ACNP and commits it to storage;
storage which overrides the update of the ClusterGroup event.
The second one caused test flake of the test case
"TestGroupNoK8sNP/Case=ACNPNestedClusterGroup".
To resolve the race conditions completely and make NetworkPolicy
handling less error prone, this patch refactors NetworkPolicyController:
directly and only triggers resync of affected policies, which ensures
that there is at most one worker handling an internal NetworkPolicy at
any moment.
deleting AddressGroups and AppliedToGroups.
Duplicate code and tests are deleted with the refactoring.
Signed-off-by: Quan Tian qtian@vmware.com
Fixes #4127