Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: add dhcp discover packet to linux swiftv2 to ensure mappings exist for dns #2989

Merged
merged 14 commits into from
Sep 10, 2024

Conversation

QxBytes
Copy link
Contributor

@QxBytes QxBytes commented Sep 4, 2024

Reason for Change:

In linux swiftv2, nslookup using the wireserver may fail in certain scenarios. The theory is that this is because the host does not have mappings/state for the dns on the secondary/delegated nics (eth1). To ensure that state is created, we send a dhcp discover packet through the delegated nic (eth1) so the mappings are created. Any offers returned are ignored.

Issue Fixed:

See above.

Requirements:

Notes:
We assume that epInfo.IfName contains the actual interface name of the nic! The dhcp discover packet MUST get a successful offer or the add call will fail. If in swiftv2 we have multiple nics, we send the dhcp request for each one and all must succeed or the add call will fail. Ensure the ifName is in epInfo.ifName

Should affect Swiftv2 Linux only.
 
Reviewer(s):

  • Code includes goroutines/synchronization (potential data races?)
  • Verify indexing is correct/no possibility of panics/accessing out of bounds
  • Code deals with sockets (verify all sockets always properly closed)

Tested:

  • Swiftv2 linux create mt pod 1.6.5 cns (we do get a reply from the dhcp server), logs corroborate
  • Attempt request with wrong mac address, we time out and error as expected
  • Attempt request with wrong ifname, we time out and error as expected
  • Context deadline error as expected if set context deadline to 0 sec
  • Repro'd issue, then applied CNI fix, then recreated pod to issue dhcp request, and then I can contact dns via secondary nic as expected (previously I couldn't)

@QxBytes QxBytes self-assigned this Sep 4, 2024
@QxBytes QxBytes added cni Related to CNI. fix Fixes something. linux swift-v2 labels Sep 4, 2024
@QxBytes QxBytes marked this pull request as ready for review September 5, 2024 23:43
@QxBytes QxBytes requested a review from a team as a code owner September 5, 2024 23:43
@QxBytes QxBytes requested a review from k-routhu September 5, 2024 23:43
dhcp/dhcp.go Outdated Show resolved Hide resolved
dhcp/dhcp.go Outdated Show resolved Hide resolved
dhcp/dhcp.go Outdated Show resolved Hide resolved
dhcp/dhcp.go Outdated Show resolved Hide resolved
@QxBytes QxBytes force-pushed the alew/dhcp branch 2 times, most recently from 8f8075b to 469ef99 Compare September 6, 2024 22:46
dhcp/dhcp.go Outdated Show resolved Hide resolved
dhcp/dhcp.go Outdated Show resolved Hide resolved
dhcp/dhcp.go Outdated Show resolved Hide resolved
tamilmani1989
tamilmani1989 previously approved these changes Sep 9, 2024
@QxBytes QxBytes enabled auto-merge September 9, 2024 22:30
@QxBytes
Copy link
Contributor Author

QxBytes commented Sep 9, 2024

/azp run Azure Container Networking PR

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@QxBytes QxBytes requested a review from a team as a code owner September 9, 2024 23:36
@QxBytes QxBytes force-pushed the alew/dhcp branch 3 times, most recently from 7220bea to e683b1d Compare September 9, 2024 23:48
@QxBytes
Copy link
Contributor Author

QxBytes commented Sep 9, 2024

/azp run Azure Container Networking PR

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@QxBytes
Copy link
Contributor Author

QxBytes commented Sep 10, 2024

/azp run Azure Container Networking PR

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@QxBytes
Copy link
Contributor Author

QxBytes commented Sep 10, 2024

/azp run Azure Container Networking PR

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@QxBytes QxBytes added this pull request to the merge queue Sep 10, 2024
Merged via the queue into master with commit 47b4329 Sep 10, 2024
93 of 94 checks passed
@QxBytes QxBytes deleted the alew/dhcp branch September 10, 2024 19:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cni Related to CNI. fix Fixes something. linux swift-v2
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants