Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cloud-init intermittent to fail rename network interface #5916

Open
rmhsawyer opened this issue Dec 6, 2024 · 0 comments
Open

cloud-init intermittent to fail rename network interface #5916

rmhsawyer opened this issue Dec 6, 2024 · 0 comments
Labels
bug Something isn't working correctly new An issue that still needs triage

Comments

@rmhsawyer
Copy link
Contributor

rmhsawyer commented Dec 6, 2024

Bug report

On Azurelinux with cloud-init 24.3 we are running test and seeing intermittent issue where cloud-init failed to rename the network interface. I found it is may due to a racing condition between cloud-init and systemd-networkd. Cloud-init renaming command: cloud-init failed to run cmd ['ip', 'link', 'set', 'eth7', 'name', 'eth1'] due to there is still an eth1 exists after cloud-init already renaming eth1 to cirename0. Cloud-init gets eventually errors out due to unexpected two eth1 exist. From the systemd logs I am seeing systemd and cloud-init trying to apply the network config and renaming at same time so I suspect this is how the race condition came along:
image

Steps to reproduce the problem

Intermittent test failed where multiple nics were populated

Environment details

  • Cloud-init version: 24.3
  • Operating System Distribution: azurelinux
  • Cloud provider, platform or installer type: Azure host

cloud-init logs

2024-11-18 08:15:04,840 - subp.py[DEBUG]: Running command ['ip', '-6', 'addr', 'show', 'permanent', 'scope', 'global'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:04,842 - subp.py[DEBUG]: Running command ['ip', '-4', 'addr', 'show'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:04,843 - net[DEBUG]: Detected interfaces {'eth6': {'downable': True, 'device_id': '0x3', 'driver': 'hv_netvsc', 'mac': '60:45:bd:ae:33:e3', 'name': 'eth6', 'up': False}, 'eth4': {'downable': True, 'device_id': '0x3', 'driver': 'hv_netvsc', 'mac': '60:45:bd:ae:38:a6', 'name': 'eth4', 'up': False}, 'lo': {'downable': False, 'device_id': None, 'driver': None, 'mac': '00:00:00:00:00:00', 'name': 'lo', 'up': True}, 'eth2': {'downable': True, 'device_id': '0x3', 'driver': 'hv_netvsc', 'mac': '60:45:bd:ae:3d:44', 'name': 'eth2', 'up': False}, 'eth0': {'downable': True, 'device_id': '0x3', 'driver': 'hv_netvsc', 'mac': '60:45:bd:ae:38:38', 'name': 'eth0', 'up': False}, 'eth7': {'downable': True, 'device_id': '0x3', 'driver': 'hv_netvsc', 'mac': '60:45:bd:ae:3d:59', 'name': 'eth7', 'up': False}, 'eth5': {'downable': True, 'device_id': '0x3', 'driver': 'hv_netvsc', 'mac': '60:45:bd:ae:3b:fe', 'name': 'eth5', 'up': False}, 'eth3': {'downable': True, 'device_id': '0x3', 'driver': 'hv_netvsc', 'mac': '60:45:bd:ae:3a:c8', 'name': 'eth3', 'up': False}, 'eth1': {'downable': True, 'device_id': '0x3', 'driver': 'hv_netvsc', 'mac': '60:45:bd:ae:3b:81', 'name': 'eth1', 'up': False}}
2024-11-18 08:15:04,844 - net[DEBUG]: Renamed [['60:45:bd:ae:38:38', 'eth0', 'hv_netvsc', '0x3'], ['60:45:bd:ae:3d:59', 'eth1', 'hv_netvsc', '0x3'], ['60:45:bd:ae:3b:fe', 'eth2', 'hv_netvsc', '0x3'], ['60:45:bd:ae:3d:44', 'eth3', 'hv_netvsc', '0x3'], ['60:45:bd:ae:3a:c8', 'eth4', 'hv_netvsc', '0x3'], ['60:45:bd:ae:38:a6', 'eth5', 'hv_netvsc', '0x3'], ['60:45:bd:ae:33:e3', 'eth6', 'hv_netvsc', '0x3'], ['60:45:bd:ae:3b:81', 'eth7', 'hv_netvsc', '0x3']] with ops [('rename', '60:45:bd:ae:3d:59', 'eth1', ('eth1', 'cirename0')), ('rename', '60:45:bd:ae:3d:59', 'eth1', ('eth7', 'eth1')), ('rename', '60:45:bd:ae:3b:fe', 'eth2', ('eth2', 'cirename1')), ('rename', '60:45:bd:ae:3b:fe', 'eth2', ('eth5', 'eth2')), ('rename', '60:45:bd:ae:3d:44', 'eth3', ('eth3', 'cirename2')), ('rename', '60:45:bd:ae:3d:44', 'eth3', ('cirename1', 'eth3')), ('rename', '60:45:bd:ae:3a:c8', 'eth4', ('eth4', 'cirename3')), ('rename', '60:45:bd:ae:3a:c8', 'eth4', ('cirename2', 'eth4')), ('rename', '60:45:bd:ae:38:a6', 'eth5', ('cirename3', 'eth5')), ('rename', '60:45:bd:ae:3b:81', 'eth7', ('cirename0', 'eth7'))]
2024-11-18 08:15:04,844 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'eth1', 'name', 'cirename0'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:04,876 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'eth7', 'name', 'eth1'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:04,880 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'eth2', 'name', 'cirename1'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:04,912 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'eth5', 'name', 'eth2'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:04,944 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'eth3', 'name', 'cirename2'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:04,972 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'cirename1', 'name', 'eth3'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:05,008 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'eth4', 'name', 'cirename3'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:05,032 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'cirename2', 'name', 'eth4'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:05,060 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'cirename3', 'name', 'eth5'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:05,096 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'cirename0', 'name', 'eth7'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:05,098 - stages.py[WARNING]: Failed to rename devices: Failed to apply network config names: [unknown] Error performing rename('eth7', 'eth1') for 60:45:bd:ae:3d:59, eth1: Unexpected error while running command.
Command: ['ip', 'link', 'set', 'eth7', 'name', 'eth1']
Exit code: 2
Reason: -
Stdout:
Stderr: RTNETLINK answers: File exists
[unknown] Error performing rename('cirename0', 'eth7') for 60:45:bd:ae:3b:81, eth7: Unexpected error while running command.
Command: ['ip', 'link', 'set', 'cirename0', 'name', 'eth7']
Exit code: 2
Reason: -
Stdout:
Stderr: RTNETLINK answers: File exists
cloud-init.log
Cloud-init -output.log

@rmhsawyer rmhsawyer added bug Something isn't working correctly new An issue that still needs triage labels Dec 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working correctly new An issue that still needs triage
Projects
None yet
Development

No branches or pull requests

1 participant