Multicluster dataplane change for Service access #3603

Merged
tnqn merged 2 commits into antrea-io:main
Jun 9, 2022

Contributor

@luolanzone luolanzone commented Apr 8, 2022

This PR builds on #3463 and includes two commits: one for the data path change, and one that changes the Multi-cluster Service's Endpoints from Pod IPs to Service ClusterIPs.

Commit 1:
Add Multi-cluster feature in Agent

  • Add a new feature gate Multicluster and configs in antrea-agent.conf, and a few extra items in antrea-agent
    cluster role including access to Gateway and ClusterInfoImport.
  • Rename the ServiceMarkTable to SNATMarkTable.
  • Add a controller for Gateway Nodes to watch Gateway and ClusterInfoImport
    events. It sets up a few OpenFlow rules to forward cross-cluster traffic to remote Gateway Nodes.
    • Add a classification rule for cross-cluster traffic with the global multi-cluster virtual
      MAC aa:bb:cc:dd:ee:f0. For example:
     table=Classifier, priority=210,in_port="antrea-tun0",dl_dst=aa:bb:cc:dd:ee:f0 actions=load:0x1->NXM_NX_REG0[0..3],load:0x1->NXM_NX_REG0[9],resubmit(,SNATConntrackZone)
    
    • Add a rule in the L3Forwarding table for cross-cluster request packets that rewrites
      the destination MAC to the global multi-cluster virtual MAC. For example (the destination CIDR
      is the remote Service ClusterIP CIDR, and the tunnel IP in NXM_NX_TUN_IPV4_DST is the remote Gateway IP):
    table=L3Forwarding, priority=200,ip,nw_dst=10.96.0.0/12 actions=mod_dl_src:ee:73:a5:81:09:c6,mod_dl_dst:aa:bb:cc:dd:ee:f0,load:0xab01b39->NXM_NX_TUN_IPV4_DST[],load:0x1->NXM_NX_REG0[4..7],resubmit(,L3DecTTL)
    
    • Add a rule in the L3Forwarding table for cross-cluster reply packets. For example
      (the destination IP is the remote Gateway IP):
    table=L3Forwarding, priority=200,ct_state=+rpl+trk,ip,nw_dst=10.176.27.57 actions=mod_dl_src:ee:73:a5:81:09:c6,mod_dl_dst:aa:bb:cc:dd:ee:f0,load:0xab01b39->NXM_NX_TUN_IPV4_DST[],load:0x1->NXM_NX_REG0[4..7],resubmit(,L3DecTTL)
    
    • Add a rule to the SNATMark table to match packets of multi-cluster Service connections and perform DNAT in the DNAT zone.
    table=SNATMark, priority=210,ip,nw_dst=10.96.0.0/12 actions=ct(commit,table=SNAT,zone=65520,exec(load:0x1->NXM_NX_CT_MARK[5]))
    
    • Add a rule to SNAT table to perform SNAT for any remote cluster traffic.
    table=SNAT, priority=200,ct_state=+new+trk,ip,nw_dst=10.96.0.0/12 actions=ct(commit,table=L2ForwardingCalc,zone=65521,nat(src=10.176.27.224),exec(load:0x1->NXM_NX_CT_MARK[4]))
    
    • Add a rule to the UnSNAT table to perform de-SNAT if the destination IP is the local Gateway IP.
    table=UnSNAT, priority=200,ip,nw_dst=10.176.27.224 actions=ct(table=ConntrackZone,zone=65521,nat)
    
    • Add a rule in the L2ForwardingCalc table that sets antrea-tun0 as the output port for packets destined to the global multi-cluster virtual MAC.
    table=L2ForwardingCalc, priority=200,dl_dst=aa:bb:cc:dd:ee:f0 actions=load:0x1->NXM_NX_REG1[],load:0x1->NXM_NX_REG0[8],resubmit(,22)
    
    • Add a rule in the Output table that matches multi-cluster traffic from/to regular Nodes
      and sends it back out through the port it arrived on.
    table=Output, priority=210,reg0=0x200/0x200,reg1=0x1,in_port=1 actions=IN_PORT
    
  • Add a controller for regular Nodes to watch Gateway and ClusterInfoImport events.
    It sets up a few OpenFlow rules to forward cross-cluster traffic to the local Gateway Node.
    • Add a rule in the L3Forwarding table for cross-cluster request packets that rewrites
      the destination MAC to the global multi-cluster virtual MAC. For example
      (the destination CIDR is the remote Service ClusterIP CIDR, and the tunnel IP in NXM_NX_TUN_IPV4_DST is the local Gateway's internal IP):
    table=L3Forwarding, priority=200,ip,nw_dst=10.96.0.0/12 actions=mod_dl_src:f2:08:93:0c:82:bd,mod_dl_dst:aa:bb:cc:dd:ee:ff,load:0xab0193b->NXM_NX_TUN_IPV4_DST[],load:0x1->NXM_NX_REG0[4..7],resubmit(,L3DecTTL)
    
    • Add a rule in the L3Forwarding table for cross-cluster reply packets. For example
      (the destination IP is the remote Gateway IP, and the tunnel IP in NXM_NX_TUN_IPV4_DST is the local Gateway's internal IP):
    table=L3Forwarding, priority=200,ct_state=+rpl+trk,ip,nw_dst=10.176.27.57 actions=mod_dl_src:f2:08:93:0c:82:bd,mod_dl_dst:aa:bb:cc:dd:ee:f0,load:0xab0193b->NXM_NX_TUN_IPV4_DST[],load:0x1->NXM_NX_REG0[4..7],resubmit(,L3DecTTL)
    
    • Add a rule in the L2ForwardingCalc table that sets antrea-tun0 as the output port for packets destined to the global multi-cluster virtual MAC.
    table=L2ForwardingCalc, priority=200,dl_dst=aa:bb:cc:dd:ee:f0 actions=load:0x1->NXM_NX_REG1[],load:0x1->NXM_NX_REG0[8],resubmit(,22)
    
  • Add unit test cases
  • Refine e2e test for data plane change
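
The samples above load values into register bit ranges (for example load:0x1->NXM_NX_REG0[9]) and later match them with a value/mask pair (reg0=0x200/0x200). The Go sketch below is only an illustration of that bit arithmetic; the helper names fieldMask, loadField and matches are hypothetical and are not part of Antrea or OVS, and the sketch says nothing about what each bit means to the agent.

```go
package main

import "fmt"

// fieldMask returns the mask covering bits start..end (inclusive) of a 32-bit
// register, mirroring the NXM_NX_REGn[start..end] notation.
// It assumes start <= end <= 31.
func fieldMask(start, end uint) uint32 {
	width := end - start + 1
	if width >= 32 {
		return ^uint32(0)
	}
	return ((uint32(1) << width) - 1) << start
}

// loadField mimics "load:value->NXM_NX_REGn[start..end]": it overwrites only
// the selected bits and leaves the rest of the register untouched.
func loadField(reg, value uint32, start, end uint) uint32 {
	mask := fieldMask(start, end)
	return (reg &^ mask) | ((value << start) & mask)
}

// matches mimics a masked match such as "reg0=0x200/0x200".
func matches(reg, value, mask uint32) bool {
	return reg&mask == value&mask
}

func main() {
	var reg0 uint32
	reg0 = loadField(reg0, 0x1, 0, 3) // load:0x1->NXM_NX_REG0[0..3]
	reg0 = loadField(reg0, 0x1, 9, 9) // load:0x1->NXM_NX_REG0[9]

	fmt.Printf("reg0=%#x\n", reg0)           // prints reg0=0x201
	fmt.Println(matches(reg0, 0x200, 0x200)) // prints true: bit 9 is set
}
```

Read this way, the Classifier sample sets bit 9 of reg0 and the Output sample matches on that same bit.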

Signed-off-by: Lan Luo <luola@vmware.com>
Co-authored-by: Hongliang Liu <lhongliang@vmware.com>

Commit 2:
Use Service ClusterIPs as MC Service's Endpoints

  1. Use Service ClusterIPs instead of Pod IPs as the MC Service's Endpoints.
    The ServiceExport controller now watches only ServiceExport and
    Service events, and wraps the Service's ClusterIPs into a new Endpoint kind of
    ResourceExport.
  2. Include the local Service ClusterIP as one of the multi-cluster Service's Endpoints as well.
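
As a rough illustration of the idea in this commit, the Go sketch below turns a Service's ClusterIPs into endpoint addresses. It deliberately uses small local stand-in types instead of the real Kubernetes or Antrea API types; Service, EndpointAddress and clusterIPsToEndpoints are hypothetical names, and skipping unparsable entries such as a headless Service's "None" is an assumption of this sketch, not something the PR states.

```go
package main

import (
	"fmt"
	"net"
)

// Service is a hypothetical, trimmed-down stand-in for a Kubernetes Service.
type Service struct {
	Name       string
	ClusterIPs []string
}

// EndpointAddress is a hypothetical stand-in for one Endpoints address entry.
type EndpointAddress struct {
	IP string
}

// clusterIPsToEndpoints wraps each ClusterIP of an exported Service as an
// endpoint address, so consumers reach the Service IP rather than Pod IPs.
func clusterIPsToEndpoints(svc Service) []EndpointAddress {
	addrs := make([]EndpointAddress, 0, len(svc.ClusterIPs))
	for _, ip := range svc.ClusterIPs {
		// Assumption: skip anything that is not a valid IP, which covers
		// the "None" value used by headless Services.
		if net.ParseIP(ip) == nil {
			continue
		}
		addrs = append(addrs, EndpointAddress{IP: ip})
	}
	return addrs
}

func main() {
	svc := Service{Name: "nginx", ClusterIPs: []string{"10.96.12.34"}}
	for _, addr := range clusterIPsToEndpoints(svc) {
		fmt.Printf("%s -> endpoint %s\n", svc.Name, addr.IP)
	}
}
```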

Signed-off-by: Lan Luo <luola@vmware.com>

@luolanzone luolanzone added the area/multi-cluster label (Issues or PRs related to multi cluster.) Apr 8, 2022
@luolanzone luolanzone marked this pull request as draft April 8, 2022 06:34
@codecov-commenter

codecov-commenter commented Apr 8, 2022

Codecov Report

Merging #3603 (f77ebbd) into main (90419f2) will decrease coverage by 7.16%.
The diff coverage is 51.59%.

@@            Coverage Diff             @@
##             main    #3603      +/-   ##
==========================================
- Coverage   63.96%   56.79%   -7.17%     
==========================================
  Files         288      406     +118     
  Lines       41252    57574   +16322     
==========================================
+ Hits        26386    32699    +6313     
- Misses      12733    22222    +9489     
- Partials     2133     2653     +520     
| Flag | Coverage Δ |
|---|---|
| integration-tests | 37.74% <ø> (?) |
| kind-e2e-tests | 50.86% <3.31%> (+0.05%) ⬆️ |
| unit-tests | 44.37% <49.04%> (+0.05%) ⬆️ |

| Impacted Files | Coverage Δ |
|---|---|
| pkg/agent/openflow/cookie/allocator.go | 76.74% <0.00%> (-3.75%) ⬇️ |
| pkg/apiserver/handlers/featuregates/handler.go | 75.40% <ø> (ø) |
| pkg/features/antrea_features.go | 11.11% <ø> (ø) |
| pkg/ovs/openflow/ofctrl_action.go | 69.45% <ø> (ø) |
| pkg/ovs/openflow/ofctrl_group.go | 50.76% <ø> (ø) |
| pkg/util/k8s/client.go | 40.00% <27.27%> (-1.08%) ⬇️ |
| pkg/agent/openflow/client.go | 66.14% <40.42%> (-1.73%) ⬇️ |
| pkg/agent/openflow/multicluster.go | 42.99% <42.99%> (ø) |
| pkg/agent/multicluster/mc_route_controller.go | 52.27% <52.27%> (ø) |
| ...lticluster/commonarea/resourceimport_controller.go | 68.90% <66.66%> (-0.46%) ⬇️ |

... and 133 more

@luolanzone
Contributor Author

/test-multicluster-dataplane-e2e

5 similar comments

@luolanzone luolanzone force-pushed the mc-dataplane branch 2 times, most recently from f7ac8fd to 1e29016 Compare April 26, 2022 09:00
@luolanzone
Contributor Author

/test-multicluster-dataplane-e2e

@luolanzone
Contributor Author

/test-multicluster-dataplane-e2e

@luolanzone luolanzone changed the title from "[WIP]Multicluster dataplane change for Service and Pod access" to "[WIP]Multicluster dataplane change for Service access" Apr 27, 2022
@luolanzone
Contributor Author

/test-multicluster-dataplane-e2e

1 similar comment

@luolanzone luolanzone force-pushed the mc-dataplane branch 2 times, most recently from 079c4e7 to 7542ec3 Compare April 27, 2022 08:07
@luolanzone
Contributor Author

/test-multicluster-dataplane-e2e

@luolanzone
Contributor Author

/test-multicluster-dataplane-e2e

@luolanzone luolanzone force-pushed the mc-dataplane branch 2 times, most recently from 552846a to be8750e Compare April 27, 2022 10:06
@luolanzone
Contributor Author

/test-multicluster-dataplane-e2e

@luolanzone
Contributor Author

/test-multicluster-dataplane-e2e

Member

@tnqn tnqn left a comment

LGTM except one question

pkg/agent/openflow/multicluster.go (resolved)
@luolanzone
Contributor Author

/test-multicluster-dataplane-e2e

@jianjuns
Contributor

jianjuns commented Jun 8, 2022

/test-all

tnqn previously approved these changes Jun 8, 2022
Member

@tnqn tnqn left a comment

LGTM

@jianjuns
Contributor

jianjuns commented Jun 8, 2022

/test-integration
/test-multicluster-e2e
/test-multicluster-dataplane-e2e
/test-ipv6-only-e2e

@jianjuns
Contributor

jianjuns commented Jun 9, 2022

/test-integration

@luolanzone
Contributor Author

@jianjuns smee service is broken, the jenkins doesn't respond to the comment now, I will check with @XinShuYang.

@luolanzone
Contributor Author

Integration test failed due to the table rename, I will fix this.

@luolanzone
Contributor Author

/test-integration
/test-multicluster-e2e
/test-multicluster-dataplane-e2e
/test-ipv6-only-e2e

@luolanzone
Contributor Author

/test-conformance
/test-e2e
/test-networkpolicy

luolanzone and others added 2 commits June 9, 2022 10:53
* Add a new feature gate `Multicluster` and configs in antrea-agent.conf, and a few extra items in antrea-agent
  cluster role including access to `Gateway` and `ClusterInfoImport`.
* Rename the `ServiceMarkTable` to `SNATMarkTable`.
* Add a controller for Gateway Nodes to watch Gateway and ClusterInfoImport's
events. It will set up a few openflow rules to forward cross-cluster traffic to remote Gateway Nodes.
    * Add a classification rule for cross-cluster traffic with global multicluster virtual
    MAC `aa:bb:cc:dd:ee:f0`. A sample is like below:
    ```
     table=Classifier, priority=210,in_port="antrea-tun0",dl_dst=aa:bb:cc:dd:ee:f0 actions=load:0x1->NXM_NX_REG0[0..3],load:0x1->NXM_NX_REG0[9],resubmit(,SNATConntrackZone)
    ```
    * Add a rule in `L3Forwarding` table for cross-cluster request packets that modifies
    the destination MAC to global multicluster virtual MAC. A sample is like below (the destination CIDR
    is remote Service ClusterIP CIDR, the tunnel IP in NXM_NX_TUN_IPV4_DST is remote Gateway IP):
    ```
    table=L3Forwarding, priority=200,ip,nw_dst=10.96.0.0/12 actions=mod_dl_src:ee:73:a5:81:09:c6,mod_dl_dst:aa:bb:cc:dd:ee:f0,load:0xab01b39->NXM_NX_TUN_IPV4_DST[],load:0x1->NXM_NX_REG0[4..7],resubmit(,L3DecTTL)
    ```
    * Add a rule in `L3Forwarding` table for cross-cluster reply packets. A sample is like below
      (the destination IP is remote Gateway IP):
    ```
    table=L3Forwarding, priority=200,ct_state=+rpl+trk,ip,nw_dst=10.176.27.57 actions=mod_dl_src:ee:73:a5:81:09:c6,mod_dl_dst:aa:bb:cc:dd:ee:f0,load:0xab01b39->NXM_NX_TUN_IPV4_DST[],load:0x1->NXM_NX_REG0[4..7],resubmit(,L3DecTTL)
    ```
    * Add a rule to `SNATMark` table to match the packets of multi-cluster Service connection and perform DNAT in DNAT zone.
    ```
    table=SNATMark, priority=210,ip,nw_dst=10.96.0.0/12 actions=ct(commit,table=SNAT,zone=65520,exec(load:0x1->NXM_NX_CT_MARK[5]))
    ```
    * Add a rule to `SNAT` table to perform SNAT for any remote cluster traffic.
    ```
    table=SNAT, priority=200,ct_state=+new+trk,ip,nw_dst=10.96.0.0/12 actions=ct(commit,table=L2ForwardingCalc,zone=65521,nat(src=10.176.27.224))
    ```
    * Add a rule to `UnSNAT` table to perform de-SNAT if destination IP is local GatewayIP.
    ```
    table=UnSNAT, priority=200,ip,nw_dst=10.176.27.224 actions=ct(table=ConntrackZone,zone=65521,nat)
    ```
    * Add a rule in `L2ForwardingCalc` table to load the global virtual multi-cluster MAC's output to `antrea-tun0`
    ```
    table=L2ForwardingCalc, priority=200,dl_dst=aa:bb:cc:dd:ee:f0 actions=load:0x1->NXM_NX_REG1[],load:0x1->NXM_NX_REG0[8],resubmit(,22)
    ```
    * Add a rule in `Output` table to match the multi-cluster traffic to forward the traffic from/to regular Node
    through the same port.
    ```
    table=Output, priority=210,reg1=0x1,in_port=1 actions=IN_PORT
    ```
* Add a controller for regular Nodes to watch Gateway and ClusterInfoImport's events.
   It will set up a few openflow rules to forward cross-cluster traffic to local Gateway Node.
    * Add a rule in L3Forwarding table for cross-cluster request packets, and
    modify the destination MAC to global multicluster virtual MAC. A sample is like below
    (the destination CIDR is remote Service ClusterIP CIDR, the tunnel IP in NXM_NX_TUN_IPV4_DST is local Gateway's Internal IP.):
    ```
    table=L3Forwarding, priority=200,ip,nw_dst=10.96.0.0/12 actions=mod_dl_src:f2:08:93:0c:82:bd,mod_dl_dst:aa:bb:cc:dd:ee:ff,load:0xab0193b->NXM_NX_TUN_IPV4_DST[],load:0x1->NXM_NX_REG0[4..7],resubmit(,L3DecTTL)
    ```
    * Add a rule in L3Forwarding table for cross-cluster reply packets. A sample is like below
    (the destination IP is remote Gateway IP, the tunnel IP in NXM_NX_TUN_IPV4_DST is local Gateway's Internal IP):
    ```
    table=L3Forwarding, priority=200,ct_state=+rpl+trk,ip,nw_dst=10.176.27.57 actions=mod_dl_src:f2:08:93:0c:82:bd,mod_dl_dst:aa:bb:cc:dd:ee:f0,load:0xab0193b->NXM_NX_TUN_IPV4_DST[],load:0x1->NXM_NX_REG0[4..7],resubmit(,L3DecTTL)
    ```
    * Add a rule in L2ForwardingCalc table to load the global virtual multi-cluster MAC's output to `antrea-tun0`
    ```
    table=L2ForwardingCalc, priority=200,dl_dst=aa:bb:cc:dd:ee:f0 actions=load:0x1->NXM_NX_REG1[],load:0x1->NXM_NX_REG0[8],resubmit(,22)
    ```
* Add unit test cases
* Refine e2e test for data plane change

Signed-off-by: Lan Luo <luola@vmware.com>
Co-authored-by: Hongliang Liu <lhongliang@vmware.com>
1. Use Service ClusterIPs instead of Pod IPs as MC Service's Endpoints.
The ServiceExport controller will only watch ServiceExport and
Service events, and wrap Service's ClusterIPs into a new Endpoint kind of
ResourceExport.
2. Include the local Service ClusterIP as one of the multi-cluster Service's Endpoints as well.

Signed-off-by: Lan Luo <luola@vmware.com>
@luolanzone
Contributor Author

/test-integration

@jianjuns
Contributor

jianjuns commented Jun 9, 2022

> Integration test failed due to the table rename, I will fix this.

Sure. Please take care of such issues earlier next time.

@luolanzone
Contributor Author

> > Integration test failed due to the table rename, I will fix this.
>
> Sure. Please take care of such issues earlier next time.

Yeah, I will, I didn't notice integration test will be impacted, will check the result in time next time.
btw, do we need to rerun all tests again? or just those failed cases?

@jianjuns
Contributor

jianjuns commented Jun 9, 2022

Ideally run all tests of test-all.

@luolanzone
Contributor Author

/test-all

1 similar comment

@luolanzone
Contributor Author

/test-multicluster-dataplane-e2e

@luolanzone
Contributor Author

Hi @tnqn all required tests are passed now, I skipped test-multicluster-e2e which is an old job. The new one jenkins-multicluster-dataplane-e2e is passed. Could you help to move forward? thanks.

@tnqn
Member

tnqn commented Jun 9, 2022

Since this change updates common flows, could you make sure ipv6-e2e and ipv6-only-e2e pass (except FQDNPolicyInCluster in ipv6-only-e2e)? We had issues with them several times and just fixed all of them except #3873, don't want to break them again when it's close to release.

@luolanzone
Contributor Author

/test-ipv6-only-e2e
/test-ipv6-e2e

@tnqn tnqn merged commit 968b330 into antrea-io:main Jun 9, 2022
@luolanzone luolanzone mentioned this pull request Jun 15, 2022
Labels
area/multi-cluster Issues or PRs related to multi cluster.
7 participants