Antrea Agent failing to start on Windows #3636

Closed
antoninbas opened this issue Apr 14, 2022 · 3 comments · Fixed by #3641
Labels
area/OS/windows Issues or PRs related to the Windows operating system. kind/bug Categorizes issue or PR as related to a bug.

Comments

@antoninbas
Contributor

Describe the bug
I'm trying to run Antrea on a Windows Node, but the agent gets into a CrashLoopBackOff state, with the following error:

ubuntu@ip-10-0-0-25:~$ kubectl  -n kube-system logs antrea-agent-windows-mgt4j -f

    Directory: C:\host\k\antrea

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
d----           4/14/2022  1:32 AM                bin
I0414 01:38:38.499059    4960 log_file.go:99] Set log file max size to 104857600
I0414 01:38:38.560935    4960 agent.go:85] Starting Antrea agent (version v1.6.0)
I0414 01:38:38.560935    4960 client.go:81] No kubeconfig file was specified. Falling back to in-cluster config
W0414 01:38:38.568126    4960 env.go:83] Environment variable POD_NAMESPACE not found
W0414 01:38:38.569546    4960 env.go:121] Failed to get Pod Namespace from environment. Using "kube-system" as the Antrea Service Namespace
I0414 01:38:38.570210    4960 prometheus.go:171] Initializing prometheus metrics
I0414 01:38:38.570210    4960 ovs_client.go:68] Connecting to OVSDB at address \\.\pipe\C:openvswitchvarrunopenvswitchdb.sock
I0414 01:38:38.570850    4960 agent.go:340] Setting up node network
I0414 01:38:38.607958    4960 agent.go:894] "Setting Node MTU" MTU=8951
I0414 01:38:43.728745    4960 net_windows.go:375] "Creating HNSNetwork" name="antrea-hnsnetwork" subnet="192.168.1.0/24" nodeIP="10.0.0.44/24" adapter=&{Index:10 MTU:9001 Name:Ethernet HardwareAddr:06:05:9d:c4:a4:11 Flags:up|broadcast|multicast}
F0414 01:38:52.060950    4960 main.go:58] Error running agent: error initializing agent: route ip+net: no such network interface

The error doesn't really help me troubleshoot anything.
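
For context, the text of the fatal error matches what Go's standard library returns from a lookup such as net.InterfaceByName when no interface has the requested name. Whether the agent calls that exact function at this point is an assumption; the sketch below only reproduces the message, using a made-up adapter name.

```go
package main

import (
	"fmt"
	"net"
)

// lookupErr returns the error text produced when looking up an interface
// by name, or "" if an interface with that name exists.
func lookupErr(name string) string {
	if _, err := net.InterfaceByName(name); err != nil {
		return err.Error()
	}
	return ""
}

func main() {
	// Hypothetical adapter name, chosen so that it should not exist on the host.
	fmt.Println(lookupErr("vEthernet (no-such-adapter)"))
}
```

The message carries no adapter name, which is why it gives so little to go on when troubleshooting.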

Versions:
Antrea v1.6.0

Additional context:
I am using Docker as the container runtime, and using the antrea-windows DaemonSet to run the agent.

Here is some additional information:

PS C:\k\antrea> Get-NetAdapter

Name                      InterfaceDescription                    ifIndex Status       MacAddress             LinkSpeed
----                      --------------------                    ------- ------       ----------             ---------
Ethernet                  AWS PV Network Device #0                     10 Up           06-05-9D-C4-A4-11         1 Gbps
vEthernet (HNS Interna... Hyper-V Virtual Ethernet Adapter #3          21 Up           00-15-5D-00-2C-00        10 Gbps
vEthernet (Ethernet)                                                   26 Up           00-15-5D-24-6D-AE        10 Gbps
vEthernet (Ethernet) 2                                                 31 Up           00-15-5D-24-66-0B        10 Gbps
vEthernet (15fdf8c8ade... Hyper-V Virtual Ethernet Adapter #2          14 Up           00-15-5D-24-64-BF        10 Gbps
vEthernet (nat)           Hyper-V Virtual Ethernet Adapter              9 Up           00-15-5D-A4-41-FB        10 Gbps
PS C:\k\antrea> Get-NetIPAddress


IPAddress         : fe80::30dc:8c11:df82:9759%21
InterfaceIndex    : 21
InterfaceAlias    : vEthernet (HNS Internal NIC)
AddressFamily     : IPv6
Type              : Unicast
PrefixLength      : 64
PrefixOrigin      : WellKnown
SuffixOrigin      : Link
AddressState      : Preferred
ValidLifetime     : Infinite ([TimeSpan]::MaxValue)
PreferredLifetime : Infinite ([TimeSpan]::MaxValue)
SkipAsSource      : False
PolicyStore       : ActiveStore

IPAddress         : fe80::edfe:65a7:d11d:4582%14
InterfaceIndex    : 14
InterfaceAlias    : vEthernet (15fdf8c8ade9b53)
AddressFamily     : IPv6
Type              : Unicast
PrefixLength      : 64
PrefixOrigin      : WellKnown
SuffixOrigin      : Link
AddressState      : Preferred
ValidLifetime     : Infinite ([TimeSpan]::MaxValue)
PreferredLifetime : Infinite ([TimeSpan]::MaxValue)
SkipAsSource      : False
PolicyStore       : ActiveStore

IPAddress         : fe80::8852:da76:bed1:9a5e%10
InterfaceIndex    : 10
InterfaceAlias    : Ethernet
AddressFamily     : IPv6
Type              : Unicast
PrefixLength      : 64
PrefixOrigin      : WellKnown
SuffixOrigin      : Link
AddressState      : Preferred
ValidLifetime     : Infinite ([TimeSpan]::MaxValue)
PreferredLifetime : Infinite ([TimeSpan]::MaxValue)
SkipAsSource      : False
PolicyStore       : ActiveStore

IPAddress         : fe80::1401:b10e:fa43:5b49%9
InterfaceIndex    : 9
InterfaceAlias    : vEthernet (nat)
AddressFamily     : IPv6
Type              : Unicast
PrefixLength      : 64
PrefixOrigin      : WellKnown
SuffixOrigin      : Link
AddressState      : Preferred
ValidLifetime     : Infinite ([TimeSpan]::MaxValue)
PreferredLifetime : Infinite ([TimeSpan]::MaxValue)
SkipAsSource      : False
PolicyStore       : ActiveStore

IPAddress         : ::1
InterfaceIndex    : 1
InterfaceAlias    : Loopback Pseudo-Interface 1
AddressFamily     : IPv6
Type              : Unicast
PrefixLength      : 128
PrefixOrigin      : WellKnown
SuffixOrigin      : WellKnown
AddressState      : Preferred
ValidLifetime     : Infinite ([TimeSpan]::MaxValue)
PreferredLifetime : Infinite ([TimeSpan]::MaxValue)
SkipAsSource      : False
PolicyStore       : ActiveStore

IPAddress         : 10.107.88.187
InterfaceIndex    : 21
InterfaceAlias    : vEthernet (HNS Internal NIC)
AddressFamily     : IPv4
Type              : Unicast
PrefixLength      : 8
PrefixOrigin      : Manual
SuffixOrigin      : Manual
AddressState      : Preferred
ValidLifetime     : Infinite ([TimeSpan]::MaxValue)
PreferredLifetime : Infinite ([TimeSpan]::MaxValue)
SkipAsSource      : False
PolicyStore       : ActiveStore

IPAddress         : 10.96.0.10
InterfaceIndex    : 21
InterfaceAlias    : vEthernet (HNS Internal NIC)
AddressFamily     : IPv4
Type              : Unicast
PrefixLength      : 8
PrefixOrigin      : Manual
SuffixOrigin      : Manual
AddressState      : Preferred
ValidLifetime     : Infinite ([TimeSpan]::MaxValue)
PreferredLifetime : Infinite ([TimeSpan]::MaxValue)
SkipAsSource      : False
PolicyStore       : ActiveStore

IPAddress         : 10.96.0.1
InterfaceIndex    : 21
InterfaceAlias    : vEthernet (HNS Internal NIC)
AddressFamily     : IPv4
Type              : Unicast
PrefixLength      : 8
PrefixOrigin      : Manual
SuffixOrigin      : Manual
AddressState      : Preferred
ValidLifetime     : Infinite ([TimeSpan]::MaxValue)
PreferredLifetime : Infinite ([TimeSpan]::MaxValue)
SkipAsSource      : False
PolicyStore       : ActiveStore

IPAddress         : 172.22.64.1
InterfaceIndex    : 14
InterfaceAlias    : vEthernet (15fdf8c8ade9b53)
AddressFamily     : IPv4
Type              : Unicast
PrefixLength      : 20
PrefixOrigin      : Manual
SuffixOrigin      : Manual
AddressState      : Preferred
ValidLifetime     : Infinite ([TimeSpan]::MaxValue)
PreferredLifetime : Infinite ([TimeSpan]::MaxValue)
SkipAsSource      : False
PolicyStore       : ActiveStore

IPAddress         : 10.0.0.44
InterfaceIndex    : 10
InterfaceAlias    : Ethernet
AddressFamily     : IPv4
Type              : Unicast
PrefixLength      : 24
PrefixOrigin      : Dhcp
SuffixOrigin      : Dhcp
AddressState      : Preferred
ValidLifetime     : 00:57:42
PreferredLifetime : 00:57:42
SkipAsSource      : False
PolicyStore       : ActiveStore

IPAddress         : 172.26.16.1
InterfaceIndex    : 9
InterfaceAlias    : vEthernet (nat)
AddressFamily     : IPv4
Type              : Unicast
PrefixLength      : 20
PrefixOrigin      : Manual
SuffixOrigin      : Manual
AddressState      : Preferred
ValidLifetime     : Infinite ([TimeSpan]::MaxValue)
PreferredLifetime : Infinite ([TimeSpan]::MaxValue)
SkipAsSource      : False
PolicyStore       : ActiveStore

IPAddress         : 127.0.0.1
InterfaceIndex    : 1
InterfaceAlias    : Loopback Pseudo-Interface 1
AddressFamily     : IPv4
Type              : Unicast
PrefixLength      : 8
PrefixOrigin      : WellKnown
SuffixOrigin      : WellKnown
AddressState      : Preferred
ValidLifetime     : Infinite ([TimeSpan]::MaxValue)
PreferredLifetime : Infinite ([TimeSpan]::MaxValue)
SkipAsSource      : False
PolicyStore       : ActiveStore
PS C:\k\antrea> Get-ComputerInfo | select WindowsProductName, WindowsVersion, OsHardwareAbstractionLayer
>>

WindowsProductName             WindowsVersion OsHardwareAbstractionLayer
------------------             -------------- --------------------------
Windows Server 2019 Datacenter 1809           10.0.17763.2686
@antoninbas antoninbas added kind/bug Categorizes issue or PR as related to a bug. area/OS/windows Issues or PRs related to the Windows operating system. labels Apr 14, 2022
@antoninbas
Contributor Author

I created a brand new Windows Node and ran into the exact same issue again after following the steps in https://github.com/antrea-io/antrea/blob/main/docs/windows.md#installation-via-wins-docker-based-runtimes

@wenyingd
Contributor

wenyingd commented Apr 14, 2022

Hi @antoninbas, I see that the Windows Node's network adapter configured with the Node IP address is named "Ethernet", and that there is an existing network adapter named "vEthernet (Ethernet)". The crash happens after the HNSNetwork is created. By default, the Windows host should then create a new virtual management network adapter named in the format "vEthernet ($netadapter_name)", and we use this format to check that the virtual management adapter exists after creating the HNSNetwork. In this testbed, that name is already used by another adapter. Antrea doesn't know this, so it continues the following steps with the wrong name, which causes the failure.

I think a fix could be to use the IP and MAC to find the virtual adapter instead of the name.
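
A minimal sketch of that idea, not the actual patch: it assumes the virtual management adapter ends up carrying the uplink's MAC and the Node IP, which the thread does not confirm. The adapter type, the findVirtualAdapter helper, and all sample names and addresses are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"net"
)

// adapter is a hypothetical, simplified view of a host network adapter.
type adapter struct {
	Name string
	MAC  net.HardwareAddr
	IPs  []net.IP
}

// findVirtualAdapter returns the name of the first adapter, other than the
// uplink itself, whose MAC equals mac and which carries ip (if ip is non-nil).
// The adapter's name plays no part in the match.
func findVirtualAdapter(adapters []adapter, uplinkName string, mac net.HardwareAddr, ip net.IP) (string, bool) {
	for _, a := range adapters {
		if a.Name == uplinkName || !bytes.Equal(a.MAC, mac) {
			continue
		}
		if ip == nil {
			return a.Name, true
		}
		for _, got := range a.IPs {
			if got.Equal(ip) {
				return a.Name, true
			}
		}
	}
	return "", false
}

// resolveSample runs the lookup on made-up data in which the default name
// "vEthernet (uplink0)" is already taken by an unrelated adapter.
func resolveSample() (string, bool) {
	uplinkMAC, _ := net.ParseMAC("02:00:00:00:00:01")
	otherMAC, _ := net.ParseMAC("02:00:00:00:00:02")
	nodeIP := net.ParseIP("192.0.2.10")
	adapters := []adapter{
		{Name: "uplink0", MAC: uplinkMAC},
		{Name: "vEthernet (uplink0)", MAC: otherMAC, IPs: []net.IP{net.ParseIP("198.51.100.1")}},
		{Name: "vEthernet (uplink0) 2", MAC: uplinkMAC, IPs: []net.IP{nodeIP}},
	}
	return findVirtualAdapter(adapters, "uplink0", uplinkMAC, nodeIP)
}

func main() {
	name, ok := resolveSample()
	fmt.Println(name, ok)
}
```

In this made-up data, a name-based check would pick the unrelated "vEthernet (uplink0)" adapter, while the MAC and IP match selects "vEthernet (uplink0) 2".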

@wenyingd wenyingd self-assigned this Apr 14, 2022
@antoninbas
Contributor Author

@wenyingd Thanks for the quick reply. Your explanation makes sense.

I ran some more experiments and it seems that this network adapter (vEthernet (Ethernet)) is created after the Node joins the cluster and the kube-proxy hostNetwork Pod is scheduled on the Node (at this point I haven't created the antrea-windows DaemonSet yet).

I am using an AWS AMI named Microsoft Windows Server 2019 Base with Containers which comes with Windows container support out-of-the-box.

It seems that this support is based on Mirantis Container Runtime:

PS C:\k\antrea> docker version
Client: Mirantis Container Runtime
 Version:           20.10.9
 API version:       1.41
 Go version:        go1.16.12m2
 Git commit:        591094d
 Built:             12/21/2021 21:34:30
 OS/Arch:           windows/amd64
 Context:           default
 Experimental:      true

Server: Mirantis Container Runtime
 Engine:
  Version:          20.10.9
  API version:      1.41 (minimum version 1.24)
  Go version:       go1.16.12m2
  Git commit:       9b96ce992b
  Built:            12/21/2021 21:33:06
  OS/Arch:          windows/amd64
  Experimental:     false

I haven't tried running Antrea on Windows in a year, so I assume quite a lot has changed since then.

It seems that the Mirantis Container Runtime may be responsible for creating this network adapter. Or maybe I am totally wrong :)

But it seems that it would be a good idea not to assume that there is no existing adapter with the vEthernet (Ethernet) name. When you have a patch, I can try it.
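
As a rough illustration of that check (a sketch, not Antrea code), the snippet below tests whether the default "vEthernet (<uplink>)" name is already in use before anything relies on it. The helper names are hypothetical, and it assumes that Go's net.Interfaces reports the same adapter names that Get-NetAdapter shows.

```go
package main

import (
	"fmt"
	"net"
)

// expectedVNICName builds the default virtual adapter name for an uplink,
// following the "vEthernet ($netadapter_name)" format quoted in this thread.
func expectedVNICName(uplink string) string {
	return fmt.Sprintf("vEthernet (%s)", uplink)
}

// nameTaken reports whether the default virtual adapter name for uplink
// already appears in names.
func nameTaken(names []string, uplink string) bool {
	want := expectedVNICName(uplink)
	for _, n := range names {
		if n == want {
			return true
		}
	}
	return false
}

// hostAdapterNames lists the names of the adapters currently on the host.
func hostAdapterNames() ([]string, error) {
	ifaces, err := net.Interfaces()
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(ifaces))
	for _, i := range ifaces {
		names = append(names, i.Name)
	}
	return names, nil
}

func main() {
	names, err := hostAdapterNames()
	if err != nil {
		fmt.Println("cannot list adapters:", err)
		return
	}
	if nameTaken(names, "Ethernet") {
		fmt.Println("name already in use:", expectedVNICName("Ethernet"))
	}
}
```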
