Virtual network device's MAC address changes after it's created with systemd v242 and later #4426

Closed
tnqn opened this issue Nov 30, 2022 · 0 comments · Fixed by #4428
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. reported-by/end-user Issues reported by end users.

tnqn commented Nov 30, 2022

Describe the bug

@wyike reported that a Node sometimes can't talk to Pods running on itself when running Antrea on Ubuntu Jammy (22.04.1). After some troubleshooting, we found that antrea-gw0's MAC address was different from the one used in the OpenFlow rules. For example, it was be:4d:3d:c8:de:94 according to ip addr:

5: antrea-gw0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether be:4d:3d:c8:de:94 brd ff:ff:ff:ff:ff:ff
    inet 10.200.0.1/24 brd 10.200.0.255 scope global antrea-gw0
       valid_lft forever preferred_lft forever

But the agent log reported 4a:d4:c3:c7:5b:68 as the MAC address after the device was created:

2022-11-25T13:34:05.376099426Z stderr F I1125 13:34:05.374980       1 agent.go:425] Agent initialized 
NodeConfig=NodeName: 2dbf45a6-c57c-4eed-bd14-657109dc6477, OVSBridge: br-int, PodIPv4CIDR: 10.200.0.0/24, 
PodIPv6CIDR: <nil>, NodeIPv4: 10.0.16.13/22, NodeIPv6: <nil>, TransportIPv4: 10.0.16.13/22, TransportIPv6: <nil>, Gateway: 
Name antrea-gw0: IPv4 10.200.0.1, IPv6 <nil>, MAC 4a:d4:c3:c7:5b:68, NetworkConfig=&{encap geneve None {psk }  [] true false}

Because of this inconsistency, the packets were dropped by the spoofguard flows.
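
The mismatch described above can be checked mechanically. The following is a minimal sketch, assuming the `ip addr` output and agent log formats shown in this report; the function names are hypothetical helpers for illustration and are not part of Antrea.

```python
import re
from typing import Optional

# Six colon-separated hex octets, e.g. be:4d:3d:c8:de:94.
MAC_RE = r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}"


def mac_from_ip_addr(output: str) -> Optional[str]:
    """Return the MAC found on the first 'link/ether' line of `ip addr` output."""
    m = re.search(rf"link/ether\s+({MAC_RE})", output, re.IGNORECASE)
    return m.group(1).lower() if m else None


def gateway_mac_from_agent_log(log: str) -> Optional[str]:
    """Return the gateway MAC from an 'Agent initialized' log entry.

    Assumes the 'Gateway: Name <dev>: ..., MAC <mac>' layout shown above;
    the entry may be wrapped across lines, hence DOTALL.
    """
    m = re.search(rf"Gateway:.*?MAC\s+({MAC_RE})", log, re.IGNORECASE | re.DOTALL)
    return m.group(1).lower() if m else None


def macs_consistent(ip_addr_output: str, agent_log: str) -> bool:
    """True only if both MACs were found and they are equal."""
    actual = mac_from_ip_addr(ip_addr_output)
    recorded = gateway_mac_from_agent_log(agent_log)
    return actual is not None and actual == recorded
```

Feeding it the two excerpts from this report yields two different addresses, which is the symptom that leads to the dropped packets.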

@wenyingd and I investigated why the MAC address changes and found that, on Ubuntu 22.04.1 and later, systemd-udevd always generates a new MAC address after a virtual device is created. This is the default behavior defined by /usr/lib/systemd/network/99-default.link:

[Match]
OriginalName=*

[Link]
NamePolicy=keep kernel database onboard slot path
AlternativeNamesPolicy=database onboard slot path
MACAddressPolicy=persistent

The policy asks udevd to generate a persistent MAC address for all devices, and this happens asynchronously, so it races with antrea-agent. If udevd updates the MAC address before antrea-agent reads it from the device, everything is fine; otherwise antrea-agent records the stale address and the issue occurs.
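
To check whether a given .link file requests this policy, its INI-like content can be parsed. This is a rough sketch, assuming a single file with the simple layout shown above; `mac_address_policy` is a hypothetical helper, and it does not reproduce how udev selects among multiple .link files or applies drop-ins.

```python
import configparser
from typing import Optional


def mac_address_policy(link_file_text: str) -> Optional[str]:
    """Return the MACAddressPolicy value from the [Link] section, if present."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    # Keep option names case-sensitive; configparser lowercases them by default.
    parser.optionxform = str
    parser.read_string(link_file_text)
    return parser.get("Link", "MACAddressPolicy", fallback=None)
```

On a host, the text could be read from /usr/lib/systemd/network/99-default.link; a return value of "persistent" means udevd may rewrite the MAC address of a newly created virtual device.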

The issue only occurs on Ubuntu Jammy and later for the following reasons:

  1. The persistent MACAddressPolicy has been in place for a long time, so udev has always tried to generate a new MAC for virtual devices, but generating a persistent MAC for a virtual device was not supported until systemd v242.
  2. Since systemd/systemd#11382 ("Allow MACAddressPolicy=persistent for all virtual devices"), systemd v242 and later releases support generating a persistent MAC for virtual devices.
  3. Ubuntu Focal (20.04) shipped with systemd v245, so it should have encountered the issue. However, its custom systemd 245.4-4ubuntu3.19 package carries a patch that partially reverts systemd/systemd#11382; see "Skip falling back to device name when net_get_name(device) fails" in https://launchpad.net/ubuntu/+source/systemd/245.4-2ubuntu1.
  4. Ubuntu Jammy (22.04) shipped with systemd v249 and without the revert patch, so generating a persistent MAC for a virtual device always succeeds.
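
The four points above reduce to a small decision rule. The sketch below encodes it, assuming MACAddressPolicy=persistent is in effect; `has_revert_patch` is a hypothetical input standing for the Ubuntu Focal patch, and how to detect that patch on a host is not shown.

```python
def udev_rewrites_virtual_mac(systemd_version: int, has_revert_patch: bool = False) -> bool:
    """Whether udev is expected to assign a persistent MAC to a new virtual device.

    systemd_version: major version number, e.g. 249.
    has_revert_patch: True if the distribution carries a patch that partially
        reverts systemd/systemd#11382 (as Ubuntu Focal's package does).
    """
    # Before v242, generating a persistent MAC for virtual devices was unsupported.
    if systemd_version < 242:
        return False
    # With the revert patch applied, the generation does not take effect.
    return not has_revert_patch
```

For example, `udev_rewrites_virtual_mac(245, has_revert_patch=True)` models Ubuntu Focal and `udev_rewrites_virtual_mac(249)` models Ubuntu Jammy.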

Although the issue was observed with antrea-gw0 only, it could happen to any virtual device, including container network devices. Because it affects Ubuntu Jammy and other OSes with systemd v242+, we need to fix it ASAP and backport the fix.

Expected

antrea-agent should generate a MAC address and set it explicitly when creating antrea-gw0 and veth devices, so that udev does not regenerate another persistent MAC address.
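
As an illustration of the first half of that expectation, here is a minimal sketch of generating a MAC address suitable for a virtual device. It is not the implementation merged in #4428; `random_local_unicast_mac` is a hypothetical helper that produces a random, locally administered, unicast address.

```python
import os


def random_local_unicast_mac() -> str:
    """Return a random MAC with the locally-administered bit set and the multicast bit cleared."""
    octets = bytearray(os.urandom(6))
    # Bit 1 of the first octet marks the address as locally administered;
    # bit 0 must be 0 for a unicast address.
    octets[0] = (octets[0] | 0x02) & 0xFE
    return ":".join(f"{b:02x}" for b in octets)
```

The resulting string would then be supplied when the device is created (for instance via the `address` argument of `ip link add`), rather than reading back whatever address the kernel picked.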

Actual behavior

On Ubuntu Jammy or another OS with systemd v242+, udevd regenerates a persistent MAC address after the device is created, which causes connectivity issues.

Versions:

OS: Ubuntu jammy or systemd v242+

tnqn added the kind/bug and priority/critical-urgent labels Nov 30, 2022
tnqn added this to the Antrea v1.10 release milestone Nov 30, 2022
tnqn closed this as completed in #4428 Dec 6, 2022
tnqn added the reported-by/end-user label Dec 11, 2023