Update MC architecture doc
Update MC architecture doc and change old diagrams' format from
png to svg.

Signed-off-by: Lan Luo <luola@vmware.com>
luolanzone committed Jun 13, 2022
1 parent ea99ee5 commit b13e13f
Showing 6 changed files with 2,328 additions and 36 deletions.
125 changes: 89 additions & 36 deletions docs/multicluster/architecture.md
@@ -6,13 +6,15 @@ ClusterSet. Antrea Multi-cluster also supports Antrea ClusterNetworkPolicy repli
Multi-cluster admins can define ClusterNetworkPolicies to be replicated across the entire
ClusterSet, and enforced in all member clusters.

The diagram below depicts a basic multi-cluster topology in Antrea.
An Antrea Multi-cluster ClusterSet includes a leader cluster and multiple member clusters.
Antrea Multi-cluster Controller needs to be deployed in the leader and all member
clusters. A cluster can serve as the leader, and meanwhile also be a member cluster of the
ClusterSet.

<img src="assets/basic-topology.png" width="500" alt="Antrea Multi-cluster Topology">
The diagram below depicts a basic Antrea Multi-cluster topology with one leader cluster
and two member clusters.

Given a set of Kubernetes clusters, there will be a leader cluster and several member clusters.
By default, a leader cluster itself is also a member cluster of a ClusterSet. A cluster
can also be configured as a dedicated leader cluster of multiple ClusterSets.
<img src="assets/basic-topology.svg" width="650" alt="Antrea Multi-cluster Topology">

## Terminology

@@ -22,8 +24,8 @@ Namespace sameness applies, which means all Namespaces with a given name are con
be the same Namespace. The ClusterSet Custom Resource Definition (CRD) defines a ClusterSet,
including the leader and member clusters' information.
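
As a rough illustration, a ClusterSet CR defined in a member cluster might look like the sketch
below. The cluster IDs, Namespaces, Secret name and API server address are made-up examples, and
the exact `v1alpha1` field layout may differ between Antrea versions; see the Antrea Multi-cluster
user guide for the authoritative schema.

```yaml
apiVersion: multicluster.crd.antrea.io/v1alpha1
kind: ClusterSet
metadata:
  name: example-clusterset           # illustrative ClusterSet name
  namespace: kube-system             # Namespace where the member controller runs (assumption)
spec:
  leaders:
    - clusterID: leader-cluster      # illustrative cluster ID of the leader
      secret: leader-access-token    # Secret holding the leader API access token (assumption)
      server: https://10.0.0.1:6443  # leader cluster API server address (example)
  members:
    - clusterID: member-cluster-east
    - clusterID: member-cluster-west
  namespace: antrea-multicluster     # leader Namespace used to exchange resources (assumption)
```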

The ClusterClaim CRD is used to claim a cluster itself as a member of a ClusterSet with a
unique cluster ID, and to claim a ClusterSet with a unique ClusterSet ID.
The ClusterClaim CRD is used to claim a ClusterSet with a unique ClusterSet ID, and to
claim the cluster itself as a member of a ClusterSet with a unique cluster ID.
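
For illustration only, the two ClusterClaims in a member cluster could look roughly like the
sketch below. The well-known claim names `id.k8s.io` and `clusterset.k8s.io` come from the
upstream ClusterClaim proposal, while the API version, Namespace and values shown here are
assumptions that may not match your Antrea version.

```yaml
# Claim this cluster's unique ID (sketch).
apiVersion: multicluster.crd.antrea.io/v1alpha2
kind: ClusterClaim
metadata:
  name: id.k8s.io
  namespace: kube-system
value: member-cluster-east
---
# Claim the ClusterSet this cluster belongs to (sketch).
apiVersion: multicluster.crd.antrea.io/v1alpha2
kind: ClusterClaim
metadata:
  name: clusterset.k8s.io
  namespace: kube-system
value: example-clusterset
```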

The MemberClusterAnnounce CRD declares a member cluster configuration to the leader cluster.

@@ -34,33 +36,40 @@ given ClusterSet.

## Antrea Multi-cluster Controller

In a member cluster, Antrea Multi-cluster creates a Deployment that runs Antrea Multi-cluster
Controller, which is responsible for exporting resources to and importing resources from a leader
cluster in a ClusterSet.
Antrea Multi-cluster Controller watches ClusterSet, ClusterClaim, Service, ServiceExport, and other
resources. It has different responsibilities in member and leader clusters.

In a leader cluster, Antrea Multi-cluster creates a Deployment that runs Antrea Multi-cluster
Controller which is responsible for converting resources from different member clusters into one
encapsulated resource as long as these resources have the same kind and match Namespace sameness.
In a member cluster, Antrea Multi-cluster creates a Deployment with one replica that runs
Antrea Multi-cluster Controller, which is responsible for exporting resources to and importing
resources from a leader cluster in a ClusterSet.

In ClusterSet initialization, Antrea Multi-cluster Controller in a member cluster watches
In a leader cluster, Antrea Multi-cluster creates a Deployment with one replica that runs
Antrea Multi-cluster Controller which is responsible for converting resources from different
member clusters into one encapsulated resource as long as these resources have the same kind
and match Namespace sameness.

When a ClusterSet is initialized in a member cluster, Antrea Multi-cluster Controller watches the
ClusterSet and ClusterClaim resources, and creates a MemberClusterAnnounce in the leader cluster.
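
The MemberClusterAnnounce written to the leader is managed by the controller itself; a very rough
sketch is shown below, where the object name, Namespace and field names are all assumptions rather
than the exact schema.

```yaml
apiVersion: multicluster.crd.antrea.io/v1alpha1
kind: MemberClusterAnnounce
metadata:
  name: member-announce-from-member-cluster-east   # assumed naming convention
  namespace: antrea-multicluster                   # leader Namespace (assumption)
clusterID: member-cluster-east                     # assumed field
clusterSetID: example-clusterset                   # assumed field
leaderClusterID: leader-cluster                    # assumed field
```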

For exporting resources from the member cluster, it watches ServiceExport, Service and Endpoints
resources, encapsulates Services and Endpoints into ResourceExports according to ServiceExports,
and writes the ResourceExports to the leader cluster. For resource importing, it watches ResourceImports
from the leader cluster, and creates multi-cluster Services and Endpoints with a prefix `antrea-mc-`
plus the exported Service name, and also ServiceImports which have the same name as the Service name.
For resource export in the member cluster, Antrea Multi-cluster Controller watches ServiceExport,
Service and Endpoints resources, encapsulates Services and Endpoints into ResourceExports
according to ServiceExports, and writes the ResourceExports to the leader cluster.
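
For example, to export a Service named `nginx` in the `default` Namespace, a user creates a
ServiceExport with the same name and Namespace, using the upstream `multicluster.x-k8s.io` API
that Antrea Multi-cluster consumes (the Service name here is illustrative):

```yaml
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: nginx        # must match the exported Service's name
  namespace: default # must match the exported Service's Namespace
```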

For resource import in the member cluster, Antrea Multi-cluster Controller watches ResourceImports
from the leader cluster, and creates multi-cluster Services and Endpoints with the `antrea-mc-` prefix
plus the exported Service name, and also ServiceImports which have the same name as the Service name.

In a leader cluster, for ClusterSet initialization, Antrea Multi-cluster controller watches and
When a ClusterSet is initialized in the leader cluster, Antrea Multi-cluster Controller watches and
validates the ClusterSet and ClusterClaim. For resource export/import, it watches ResourceExports
and encapsulates them into ResourceImports.

## Service Export and Import
### Service Export and Import

<img src="assets/resource-export-import-pipeline.png" width="800" alt="Antrea Multi-cluster Resource Export/Import Pipeline">
<img src="assets/resource-export-import-pipeline.svg" width="1500" alt="Antrea Multi-cluster Resource Export/Import Pipeline">

The current multi-cluster implementation supports Service discovery and Service export/import among
member clusters. The above diagram depicts the Antrea Multi-cluster resource export/import pipeline.
The current multi-cluster implementation supports Service discovery and resource export/import among
member clusters. The above diagram depicts the Antrea Multi-cluster resource export/import pipeline,
using Service as an example.

Given two Services in the member clusters - `foo.ns.cluster1.local` and `foo.ns.cluster2.local`,
multi-cluster Services may be generated by the resource export/import pipeline as follows.
@@ -80,18 +89,62 @@ Service `cluster1-ns-foo-service`, `cluster2-ns-foo-service` and associated Endp
ServiceImport `ns/foo` locally if they don't exist or updates them if the resources have already
been created by the Importer earlier.
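
Continuing the `foo`/`ns` example, the imported objects in a member cluster might look roughly
like the sketch below. Only the naming convention (`antrea-mc-` prefix for the Service, same name
for the ServiceImport) comes from the text above; the port and the `ClusterSetIP` type are
assumptions for illustration.

```yaml
# Multi-cluster Service derived by the importer (sketch).
apiVersion: v1
kind: Service
metadata:
  name: antrea-mc-foo   # "antrea-mc-" prefix + exported Service name
  namespace: ns
spec:
  type: ClusterIP
  ports:
    - port: 80          # assumed to mirror the exported Service's ports
      protocol: TCP
---
# ServiceImport with the same name as the exported Service (sketch).
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceImport
metadata:
  name: foo
  namespace: ns
spec:
  type: ClusterSetIP    # assumption
  ports:
    - port: 80
      protocol: TCP
```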

## Antrea Multi-cluster Service

Antrea Multi-cluster Controller only supports Services of type ClusterIP at this moment. In
order to support multi-cluster Service access between member clusters, Antrea requires that member
clusters' Pod IPs are reachable across clusters and that Pod CIDRs do not overlap between member clusters.

When Antrea Multi-cluster Controller in a member cluster observes a ResourceImport creation event
in the leader cluster, it will create a multi-cluster Service, Endpoints and ServiceImport locally.
The Service ports definition will be the same as the exported Services', and the Endpoints will be Pod
IPs from all member clusters. The newly created Antrea Multi-cluster Service is just like a regular
Kubernetes Service, so Pods in a member cluster can access the multi-cluster Service as usual without
any extra setting.
### Service Access Across Clusters

Since Antrea v1.7.0, Multi-cluster Gateways must be configured to support Multi-cluster Service
access across member clusters, and Service CIDRs cannot overlap between clusters. Refer to
[Antrea Multi-cluster Gateway](#antrea-multi-cluster-gateway) for more information. Before Antrea
v1.7.0, Pod IPs had to be directly reachable across clusters for Multi-cluster Service access, and
Pod CIDRs could not overlap between clusters. Antrea Multi-cluster only supports Services of type
ClusterIP at this moment.

### Antrea Multi-cluster Gateway

Antrea has supported Multi-cluster Gateways since v1.7.0. Users can choose one of the K8s Nodes
as the Gateway in a member cluster. The assigned Gateway Node is responsible for forwarding all
cross-cluster traffic from the local cluster to other member clusters. Below is the basic architecture
of Antrea Multi-cluster connectivity with Gateways.

<img src="assets/mc-gateway.svg" width="800" alt="Antrea Multi-cluster Gateway">

Antrea Agent is responsible for setting up tunnels between member clusters' Gateways. At the moment,
Multi-cluster Service connectivity only works in `encap` mode, and the Geneve tunnel type is used by
default. All member clusters in a ClusterSet need to deploy Antrea with the same tunnel type.
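
For reference, the relevant antrea-agent configuration would look like the excerpt below; `encap`
and `geneve` are shown as an example of keeping the tunnel settings consistent across all member
clusters.

```yaml
# Excerpt from antrea-agent.conf: every member cluster in the ClusterSet
# should use the same tunnelType, and Multi-cluster traffic requires encap mode.
trafficEncapMode: encap
tunnelType: geneve
```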

To support Service connectivity in a ClusterSet, Antrea Multi-cluster introduces two new CRDs, `Gateway` and
`ClusterInfoImport`, which help exchange member clusters' basic network information. It relies on the existing
resource export/import pipeline. Since v1.7.0, the Endpoints of a multi-cluster Service are the Cluster IPs of
the Services exported by other member clusters. Please refer to the [Antrea Multi-cluster User Guide](user-guide.md)
for more details about how to deploy and use it. Note that it supports IPv4 only and does not
support overlapping Service Cluster IPs between clusters.
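
A rough sketch of a Gateway CR is shown below, assuming the Gateway Node is designated via a Node
annotation as described in the user guide; the `gatewayIP`/`internalIP` fields and the example
addresses are assumptions, not the authoritative schema.

```yaml
apiVersion: multicluster.crd.antrea.io/v1alpha1
kind: Gateway
metadata:
  name: node-a1            # name of the Gateway Node in this member cluster
  namespace: kube-system   # Namespace where Antrea Multi-cluster runs (assumption)
gatewayIP: 10.17.27.55     # IP used to terminate cross-cluster tunnels (assumption)
internalIP: 10.17.27.55    # IP used for in-cluster forwarding (assumption)
```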

### Antrea Multi-cluster Service Traffic Walk

Let's use the above diagram as a sample ClusterSet. Below is the basic information of the
ClusterSet, which includes three member clusters in our example.

1. Cluster A has a client Pod named `Pod-A` running on a regular Node, and a multi-cluster Service
named `antrea-mc-nginx` with Cluster IP `10.112.10.11` in the `default` Namespace.
2. Cluster B exports a Service named `nginx` with Cluster IP `10.96.2.22` in the `default` Namespace.
The Service has one Endpoint `17.170.11.22`, which is `Pod-B`'s IP.
3. Cluster C exports a Service named `nginx` with Cluster IP `10.11.12.33` in the `default` Namespace.
The Service has one Endpoint `172.10.11.33`, which is `Pod-C`'s IP.

The multi-cluster Service `antrea-mc-nginx` in cluster A will have two corresponding Endpoints:

* `nginx` Service's Cluster IP `10.96.2.22` from cluster B.
* `nginx` Service's Cluster IP `10.11.12.33` from cluster C.
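
Sketched as a regular Kubernetes Endpoints object, this would look roughly as follows (the port is
an assumption; the addresses are the exported Services' Cluster IPs listed above):

```yaml
apiVersion: v1
kind: Endpoints
metadata:
  name: antrea-mc-nginx
  namespace: default
subsets:
  - addresses:
      - ip: 10.96.2.22    # nginx Service's Cluster IP from cluster B
      - ip: 10.11.12.33   # nginx Service's Cluster IP from cluster C
    ports:
      - port: 80          # assumed Service port
        protocol: TCP
```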

When the client Pod `Pod-A` in cluster A tries to access the multi-cluster Service `antrea-mc-nginx`,
the Antrea Agent on the regular Node `node-a2` will randomly choose one Endpoint as the destination IP and
perform Endpoint DNAT. Let's say the Antrea Agent chooses the Endpoint `10.11.12.33` from cluster C as the
destination. The request packets will be encapsulated and sent to the Gateway Node (`node-a1`) in cluster A
first, then the Antrea Agent on `node-a1` will perform SNAT to set the source IP to its own Gateway IP,
encapsulate the packets, and send them through the `antrea-tun0` tunnel interface to the target Gateway Node
(`node-c1`) in cluster C. When the request packets go through OVS on `node-c1`, the Antrea Agent on `node-c1`
will perform Endpoint DNAT again to choose an Endpoint, which is a Pod IP, as the destination. Since there is
only one Pod (`Pod-C`) running as the `nginx` Service's workload in cluster C, the final destination IP will be
`172.10.11.33`, and the request packets will eventually reach `Pod-C` on Node `node-c2`.

## Antrea Multi-cluster NetworkPolicy

Binary file removed docs/multicluster/assets/basic-topology.png
