Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ExternalNode] Add feature gate to run Agent on a non-Kubernetes Node #3542

Merged
merged 1 commit into from
Apr 26, 2022

Conversation

wenyingd
Copy link
Contributor

@wenyingd wenyingd commented Mar 30, 2022

  1. Use feature gate "ExternalNode" to support Antrea Agent running
    on a non-Kubernetes Node.
  2. Support Agent running APIServer only on localhost if it is not
    running on cluster worker Node.
  3. CNIServer is loaded only when Agent is running on cluster worker
    Node.

Signed-off-by: wenyingd wenyingd@vmware.com

@wenyingd wenyingd requested review from tnqn, jianjuns and mengdie-song and removed request for tnqn March 30, 2022 03:03
@wenyingd
Copy link
Contributor Author

/test-all
/test-windows-all
/test-ipv6-all
/test-ipv6-only-all

@codecov-commenter
Copy link

codecov-commenter commented Mar 30, 2022

Codecov Report

Merging #3542 (aacc0f4) into feature/externalnode (3418e20) will decrease coverage by 7.91%.
The diff coverage is 56.19%.

Impacted file tree graph

@@                   Coverage Diff                    @@
##           feature/externalnode    #3542      +/-   ##
========================================================
- Coverage                 63.54%   55.62%   -7.92%     
========================================================
  Files                       278      392     +114     
  Lines                     39502    55133   +15631     
========================================================
+ Hits                      25101    30669    +5568     
- Misses                    12466    22089    +9623     
- Partials                   1935     2375     +440     
Flag Coverage Δ
integration-tests 38.25% <ø> (?)
kind-e2e-tests 49.02% <55.23%> (-1.69%) ⬇️
unit-tests 43.82% <30.52%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/agent/openflow/pipeline.go 72.72% <ø> (-1.78%) ⬇️
pkg/features/antrea_features.go 8.00% <0.00%> (-3.12%) ⬇️
pkg/agent/agent.go 53.60% <35.18%> (-1.33%) ⬇️
pkg/util/env/env.go 42.85% <57.14%> (+1.87%) ⬆️
pkg/agent/config/node_config.go 96.00% <85.71%> (-4.00%) ⬇️
pkg/agent/apiserver/apiserver.go 66.30% <100.00%> (ø)
pkg/agent/openflow/client.go 72.31% <100.00%> (+0.13%) ⬆️
pkg/controller/egress/store/egressgroup.go 1.72% <0.00%> (-54.32%) ⬇️
pkg/agent/util/net.go 29.33% <0.00%> (-22.23%) ⬇️
pkg/agent/route/route_linux.go 28.59% <0.00%> (-19.73%) ⬇️
... and 133 more

@wenyingd
Copy link
Contributor Author

/test-e2e
/test-networkpolicy
/test-windows-networkpolicy

Copy link
Contributor

@jianjuns jianjuns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a way to add a feature gate to isolate VM/BM code?

@@ -2975,6 +2975,12 @@ data:
# Note that setting ProxyLoadBalancerIPs to false usually only makes sense when ProxyAll is set to true and
# kube-proxy is removed from the cluser, otherwise kube-proxy will still load-balance this traffic.
#proxyLoadBalancerIPs: true

# The role in which Antrea Agent is working. Supported values:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, can we generate a separate yaml for VM/BM, to avoid these confusing parameters for normal K8s?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I create a sperate build directory for VM/BM configuration, but I didn't modify the create generate-manifest script to generate yaml file because for VM/BM case antrea-agent.conf is used directly on agent side and we don't need a all-in-one yaml file to apply on K8s cluster. Besides, In @mengdie-song 's PR, she will create a separate CRD yaml file for VM agent which could be applied on the cluster, and I think that file should also place in the new configration directory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just mean could we find a way to avoid changing the regular yamls and adding VM only contents.


# The role in which Antrea Agent is working. Supported values:
# - Node (default)
# - VM
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any difference between VM and BM and so we need two types? If not or not clear, let us just add one type called "VirtualMachine" for now.

@wenyingd wenyingd force-pushed the vmbm_agent_role branch 5 times, most recently from 92a71f8 to be15219 Compare March 31, 2022 11:44
@jianjuns
Copy link
Contributor

Any thoughts on this one?

Do we have a way to add a feature gate to isolate VM/BM code?

@wenyingd wenyingd force-pushed the vmbm_agent_role branch 2 times, most recently from 2a7178d to e826128 Compare April 1, 2022 16:00
@@ -0,0 +1,39 @@
# FeatureGates is a map of feature names to bools that enable or disable experimental features.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably change "externalnode" in the path to "vmnode".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, the problem with "externalnode" is that it can refer to a node/policy receiver without Agent running.

# FeatureGates is a map of feature names to bools that enable or disable experimental features.
featureGates:
# Enable running agent on a Virtual Machine.
ExternalNode: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably call it "VirtualMachineNode".

if o.config.TunnelType == "" {
o.config.TunnelType = defaultTunnelType
}
if o.config.ServiceCIDR == "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this for VM? Why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is to avoid agent crash on line 140 in cmd/antrea-agent/agent.go. o.config.ServiceCIDR is used to parse the serviceCIDR directly. Maybe I should add feature gate check in that logic.

@@ -59,6 +59,13 @@ var (
VirtualServiceIPv6 = net.ParseIP("fc01::aabb:ccdd:eeff")
)

type NodeType string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we often compare the values, probably use enum.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With feature gate, this is not ofen used in value comparasion. I want to keep string value because I planed to show the Node type in AntreaAgentInfo in future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feature gate is not used in "inner" pkgs. Always converted to a boolean.

I mean if we compare the values in any frequent call path (e.g. when handling a new change from API or CNI), I think enum is better. If it is just used in bootstrapping, string is fine. Even for enum, you can add a String() func.

@wenyingd wenyingd force-pushed the feature/externalnode branch from e34639d to 03b3f2b Compare April 7, 2022 11:15

func (o *Options) validateVMOptions() error {
var unsupported []string
for f, enabled := range o.config.FeatureGates {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There can be features not enabled by feature gate (e.g. encapMode, enableNodePortLocal whose feature gate is enabled by default). Could you check any such features should be validated here too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you check all the options and no other unsupported options?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made two changes:

  1. Add unsupported options check with EnableIPSecTunnel, although it is deprecated, the encryptionMode is possibly changed to TrafficEncryptionModeIPSec if it is set as true
  2. Set encryptionMode default value as TrafficEncryptionModeNone in Options.run function, and load value from the configuration only when encapMode requires supporting Encap ( the ToT code actually has a useless configuration parsing with noEncap mode ).

@@ -59,6 +59,13 @@ var (
VirtualServiceIPv6 = net.ParseIP("fc01::aabb:ccdd:eeff")
)

type NodeType string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feature gate is not used in "inner" pkgs. Always converted to a boolean.

I mean if we compare the values in any frequent call path (e.g. when handling a new change from API or CNI), I think enum is better. If it is just used in bootstrapping, string is fine. Even for enum, you can add a String() func.

NodeNameEnvKey = "NODE_NAME"
// ExternalNodeEnvKey is an environment variable used in ExternalNode feature to tell the expected name with which
// antrea agent is running. This is also used in the shell where antctl is running on a VM host.
ExternalNodeEnvKey = "EXTERNAL_NODE"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it just a name string? If so, why not reuse "NODE_NAME"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mainly for antctl support. I need a different env variable to tell antctl it is running on a VM but not a K8s Node.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we assume antctl always runs in a container? If not, the env will not be set anyway right?

I feel better to use another env variable to indicate node type for antctl, and also add a parameter for antctl for the case it runs without an env.

if nodeName != "" {
return
}
klog.Infof("Environment variable %s not found, using hostname instead", ExternalNodeEnvKey)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to print NodeNameEnvKey?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NodeNameEnvKey is printed in line 56, maybe I should remove line 56 and change to NodeNameEnvKey in this line.

@wenyingd
Copy link
Contributor Author

wenyingd commented Apr 8, 2022

/test-all
/test-ipv6-all
/test-ipv6-only-all
/test-windows-all

@wenyingd
Copy link
Contributor Author

/test-windows-e2e
/test-windows-conformance

@tnqn
Copy link
Member

tnqn commented Apr 26, 2022

/test-conformance

# The path to access the kubeconfig file used in the connection to Antrea Controller. The file contains the
# antrea-controller APIServer endpoint and the token of ServiceAccount required in the connection.
antreaClientConnection:
kubeconfig: antrea-agent.antrea.kubeconfig
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wenyingd should we use helm to generate this file based on template? Otherwise we would need to update this file manually whenever there is a change to related parameters? I don't mean it needs to be done in this PR.

@tnqn tnqn merged commit 3f35235 into antrea-io:feature/externalnode Apr 26, 2022
@wenyingd wenyingd deleted the vmbm_agent_role branch April 28, 2022 06:15
wenyingd added a commit to wenyingd/antrea that referenced this pull request Apr 29, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
wenyingd added a commit to wenyingd/antrea that referenced this pull request May 7, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
wenyingd added a commit to wenyingd/antrea that referenced this pull request May 20, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
wenyingd added a commit to wenyingd/antrea that referenced this pull request May 20, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
wenyingd added a commit to wenyingd/antrea that referenced this pull request May 26, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
wenyingd added a commit to wenyingd/antrea that referenced this pull request May 27, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
wenyingd added a commit to wenyingd/antrea that referenced this pull request Jun 1, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
wenyingd added a commit to wenyingd/antrea that referenced this pull request Jun 1, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
wenyingd added a commit to wenyingd/antrea that referenced this pull request Jun 1, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
wenyingd added a commit to wenyingd/antrea that referenced this pull request Jun 9, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
wenyingd added a commit to wenyingd/antrea that referenced this pull request Jun 27, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
wenyingd added a commit to wenyingd/antrea that referenced this pull request Jun 28, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
wenyingd added a commit to wenyingd/antrea that referenced this pull request Jun 28, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
wenyingd added a commit to wenyingd/antrea that referenced this pull request Jul 12, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
wenyingd added a commit to wenyingd/antrea that referenced this pull request Jul 27, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
mengdie-song pushed a commit to mengdie-song/antrea that referenced this pull request Aug 3, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
mengdie-song pushed a commit to mengdie-song/antrea that referenced this pull request Aug 3, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
mengdie-song pushed a commit to mengdie-song/antrea that referenced this pull request Aug 3, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
mengdie-song pushed a commit to mengdie-song/antrea that referenced this pull request Aug 8, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
mengdie-song pushed a commit to mengdie-song/antrea that referenced this pull request Aug 8, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
mengdie-song pushed a commit to mengdie-song/antrea that referenced this pull request Aug 8, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
mengdie-song pushed a commit to mengdie-song/antrea that referenced this pull request Aug 8, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
wenyingd added a commit to wenyingd/antrea that referenced this pull request Aug 10, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
wenyingd added a commit to wenyingd/antrea that referenced this pull request Aug 11, 2022
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
heanlan pushed a commit to heanlan/antrea that referenced this pull request Mar 29, 2023
…3542)

1. Add feature gate for ExternalNode which enables running Agent on a
   VM or BM.
2. Support Agent running APIServer only on localhost if it is not
   running on cluster worker Node.
3. CNIServer is loaded only when Agent is running on cluster worker
   Node.
4. Use a seperate build directory to generate agent configrations for
   ExternalNode.

Signed-off-by: wenyingd <wenyingd@vmware.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants