
Allow live-migration for pod network in bridge mode #7768

Closed
wants to merge 1 commit

Conversation

kvaps
Member

@kvaps kvaps commented May 18, 2022

This PR enables the live-migration feature for pod networking in any mode.

The feature is protected by the NetworkAwareLiveMigration feature gate.

The following functionality has been added:

  • If the MAC address changed: detach and re-attach the interface after live migration so the correct MAC address is set
  • If the MAC address is unchanged: set the interface link down and back up to force the VM to request a new IP address and routes from DHCP (see the sketch below)
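
A minimal Go sketch of this decision logic; handlePostMigration, detachAndReattach and flapLink are hypothetical names standing in for the operations above, not code from this PR:

```go
package network

// handlePostMigration is a hypothetical hook run in the target pod after a
// live migration; the two callbacks stand in for the operations listed above.
func handlePostMigration(macChanged bool, detachAndReattach, flapLink func() error) error {
	if macChanged {
		// MAC changed: re-plug the interface so the guest sees the new MAC.
		return detachAndReattach()
	}
	// MAC unchanged: bounce the link so the guest's DHCP client renews its
	// IP address and routes.
	return flapLink()
}
```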

The full proposal is described in kubevirt/community#182

fixes #7238

What this PR does / why we need it:

This feature allows live-migrating VMs that use the pod network in bridge and macvtap modes.

At Deckhouse we are developing a virtualization solution based on KubeVirt and Cilium.
We'd like to make live-migration work for the lower-latency binding modes (macvtap and bridge).

Which issue(s) this PR fixes

Special notes for your reviewer:

Slack thread: https://kubernetes.slack.com/archives/C8ED7RKFE/p1652296916563589?thread_ts=1650371688.593359&cid=C8ED7RKFE

Release note:

New feature gate NetworkAwareLiveMigration, which allows live-migration for a pod network connected in bridge mode

@kubevirt-bot kubevirt-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. size/L needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 18, 2022
@kubevirt-bot
Contributor

Hi @kvaps. Thanks for your PR.

I'm waiting for a kubevirt member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Contributor

@maiqueb maiqueb left a comment


Some initial comments.

Let's check out your new e2e test
/test pull-kubevirt-e2e-k8s-1.21-sig-network

@maiqueb
Contributor

maiqueb commented May 18, 2022

Some more questions:

  • for which binding type would this apply to ? bridge ?...
  • did you consider IPAM (when bridge binding is used) in this scenario ? Like, inside the VMI, we will have the same IP address, but what about the pod interface ? Unless the CNI features some sort of sticky IP, once the lease expires, it will get a new IP - which might break the applications inside the VM. And this means that if the CNI features a sticky IP, you'll have 2 live pods with the same IP. This, iiuc, breaks the kubernetes networking model, where each pod can reach any other pod - i.e. which one would it reach ?...

@kvaps
Member Author

kvaps commented May 18, 2022

for which binding type would this apply to ? bridge ?...

I tested this with bridge and macvtap

did you consider IPAM (when bridge binding is used) in this scenario ? Like, inside the VMI, we will have the same IP address, but what about the pod interface ? Unless the CNI features some sort of sticky IP, once the lease expires, it will get a new IP - which might break the applications inside the VM. And this means that if the CNI features a sticky IP, you'll have 2 live pods with the same IP. This, iiuc, breaks the kubernetes networking model, where each pod can reach any other pod - i.e. which one would it reach ?...

Yeah, both scenarios are supported:

  • If the target pod has the same IP, only the routes will be updated.
  • If the target pod has a different IP, a new IP will be assigned.

Both are done by setting the link down and back up to force the DHCP client to renew its lease (see the sketch below).
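
A minimal sketch of that link flap, assuming the vishvananda/netlink library (which KubeVirt already uses for pod network setup); the function and interface name are illustrative, not the PR's actual code:

```go
package network

import "github.com/vishvananda/netlink"

// flapLink sets a pod interface down and back up so the guest's DHCP client
// drops its current lease and re-requests the IP address and routes.
func flapLink(ifaceName string) error {
	link, err := netlink.LinkByName(ifaceName)
	if err != nil {
		return err
	}
	if err := netlink.LinkSetDown(link); err != nil {
		return err
	}
	return netlink.LinkSetUp(link)
}
```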

I tested this with cilium/cilium#19789

@kubevirt-bot kubevirt-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 28, 2022
@EdDev
Member

EdDev commented May 28, 2022

@kvaps , this is a very interesting proposal.
The topic itself was raised multiple times and got stuck due to the corner cases which k8s brings in.

I think it would be very useful to push this idea forward through a formal proposal.

I am interested in seeing the use cases where such an option fits better than the existing masquerade, a discussion on when and how to enable migrations in such setups, and the technical details of how such a migration will work from the pod, VM and guest perspective.

On the last point regarding the technical details, I remember the challenge of keeping the MAC address active at both the source and target pods for the duration of the migration.

Generally speaking, if this fits a specific configuration (e.g. a specific CNI, guests with/without DHCP, acceptable link-flickering on the vNIC, etc.), I see no reason not to have it in as an option. The main risk and limitation is that future features/changes may break it (i.e. if activating a future enhancement breaks this migration option, it may still be accepted in and you will not be able to use it).

I hope this can be discussed through a proposal.

@EdDev
Member

EdDev commented May 28, 2022

/sig network
/cc @AlonaKaplan
/cc @phoracek

Contributor

@maiqueb maiqueb left a comment


Some code related nits, and requests for unit / e2e tests.

I agree with @EdDev - this would move forward faster if you presented a design proposal of this work.

@AlonaKaplan
Member

Some code related nits, and requests for unit / e2e tests.

I agree with @EdDev - this would move forward faster if you presented a design proposal of this work.

@kvaps thanks for the PR. I agree with Miguel and Eddy that it would be much easier to continue the discussion over a proposal.
Previously @maiqueb suggested two different solutions to the same issue:

  1. Make the DHCP server advertise very short leases, thus causing the client to re-acquire the lease.
  2. Using FORCERENEW dhcp server option to force the client to move to the RENEW state.

It would be nice to explain why unlinking/linking the interface is better than either of those options.

Another question I would love to discuss over the proposal:

  • Why is the migration not allowed if the MAC address is not persistent, but allowed when the IP address is not persistent? What is the difference?

@kvaps
Member Author

kvaps commented Jun 16, 2022

Sorry for the delayed answer.

Make the DHCP server advertise very short leases, thus causing the client to re-acquire the lease.

This does not work in every distribution; IIRC some old Ubuntu and Debian releases set the IP lifetime to infinity even though the DHCP server provided the lease for only a short amount of time.

Using FORCERENEW dhcp server option to force the client to move to the RENEW state.

This is still only an RFC and is not supported by some distributions, or by MS Windows for example.

The link down/up method works perfectly in both cases.

Why is the migration not allowed if the MAC address is not persistent, but allowed when the IP address is not persistent? What is the difference?

Well, we can force the VM to renew its IP configuration (this is not only about the IP address, but also the routes).
But there is no easy way to force the VM to renew its MAC address.

However, I was also thinking about this, and I think we have two options:

  • We could detach and re-attach the interface (see the sketch below)
  • We could generate a cloud-init config with the MAC address and push it into the VM somehow. But this would also lead to a link down/up inside the VM

I think detaching the link of a running virtual machine can be a destructive action if an application is bound to a specific interface rather than 0.0.0.0, so we should provide this capability only behind a specific feature flag.
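
For the first option, a rough sketch of what a detach/re-attach could look like through the libvirt Go bindings; the connection URI, domain name and interface XML are placeholder assumptions, and this is not the code in this PR:

```go
package network

import "libvirt.org/go/libvirt"

// replugInterface hot-unplugs and re-plugs a network interface on a running
// domain so the guest picks up a new MAC address. All arguments are
// illustrative placeholders.
func replugInterface(domainName, oldIfaceXML, newIfaceXML string) error {
	conn, err := libvirt.NewConnect("qemu:///system")
	if err != nil {
		return err
	}
	defer conn.Close()

	dom, err := conn.LookupDomainByName(domainName)
	if err != nil {
		return err
	}
	defer dom.Free()

	// Hot-unplug the old interface definition from the live domain ...
	if err := dom.DetachDeviceFlags(oldIfaceXML, libvirt.DOMAIN_DEVICE_MODIFY_LIVE); err != nil {
		return err
	}
	// ... and hot-plug the definition that carries the new MAC address.
	return dom.AttachDeviceFlags(newIfaceXML, libvirt.DOMAIN_DEVICE_MODIFY_LIVE)
}
```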

@kubevirt-bot kubevirt-bot added dco-signoff: no Indicates the PR's author has not DCO signed all their commits. and removed dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Jun 17, 2022
@kvaps kvaps marked this pull request as draft June 17, 2022 15:30
@kubevirt-bot kubevirt-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 17, 2022
@kvaps kvaps force-pushed the allow-live-migration branch 4 times, most recently from 723e19e to cdbd9d9 Compare June 18, 2022 08:23
@kvaps kvaps changed the base branch from release-0.58 to release-0.59 March 1, 2023 14:31
@kvaps kvaps force-pushed the allow-live-migration branch from 1a161d3 to d8fdb00 Compare March 1, 2023 14:31
@kubevirt-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign dhiller for approval by writing /assign @dhiller in a comment. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kvaps
Member Author

kvaps commented Mar 1, 2023

PR rebased on v0.59

old versions:

@kubevirt-bot
Contributor

@kvaps: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-kubevirt-e2e-k8s-1.21-operator fb8bb98 link true /test pull-kubevirt-e2e-k8s-1.21-operator
pull-kubevirt-e2e-k8s-1.21-sig-compute-migrations b8e1f36 link true /test pull-kubevirt-e2e-k8s-1.21-sig-compute-migrations
pull-kubevirt-e2e-k8s-1.23-sig-compute-cgroupsv2 f545ae7 link true /test pull-kubevirt-e2e-k8s-1.23-sig-compute-cgroupsv2
pull-kubevirt-e2e-k8s-1.23-sig-storage-cgroupsv2 f545ae7 link true /test pull-kubevirt-e2e-k8s-1.23-sig-storage-cgroupsv2
pull-kubevirt-fossa b7ceec1 link false /test pull-kubevirt-fossa
pull-kubevirt-verify-rpms d9ec410 link false /test pull-kubevirt-verify-rpms
pull-kubevirt-e2e-k8s-1.23-sig-compute-nonroot d9ec410 link true /test pull-kubevirt-e2e-k8s-1.23-sig-compute-nonroot-0.56
pull-kubevirt-e2e-k8s-1.22-sig-compute-migrations d9ec410 link true /test pull-kubevirt-e2e-k8s-1.22-sig-compute-migrations-0.56
pull-kubevirt-e2e-k8s-1.23-sig-compute-migrations-nonroot d9ec410 link true /test pull-kubevirt-e2e-k8s-1.23-sig-compute-migrations-nonroot-0.56
pull-kubevirt-e2e-kind-1.22-sriov d9ec410 link true /test pull-kubevirt-e2e-kind-1.22-sriov-0.56
pull-kubevirt-e2e-k8s-1.23-sig-network-nonroot d9ec410 link true /test pull-kubevirt-e2e-k8s-1.23-sig-network-nonroot
pull-kubevirt-e2e-k8s-1.23-sig-storage-nonroot d9ec410 link true /test pull-kubevirt-e2e-k8s-1.23-sig-storage-nonroot
pull-kubevirt-e2e-k8s-1.24-sig-storage-dv-gc d9ec410 link true /test pull-kubevirt-e2e-k8s-1.24-sig-storage-dv-gc
pull-kubevirt-e2e-k8s-1.22-sig-performance d9ec410 link false /test pull-kubevirt-e2e-k8s-1.22-sig-performance
pull-kubevirt-e2e-kind-1.22-sriov-nonroot 1a161d3 link true /test pull-kubevirt-e2e-kind-1.22-sriov-nonroot-0.58
pull-kubevirt-e2e-kind-1.23-vgpu-nonroot 1a161d3 link true /test pull-kubevirt-e2e-kind-1.23-vgpu-nonroot-0.58
pull-kubevirt-unit-test-arm64 d8fdb00 link false /test pull-kubevirt-unit-test-arm64-0.59
pull-kubevirt-goveralls d8fdb00 link false /test pull-kubevirt-goveralls-0.59
pull-kubevirt-e2e-k8s-1.25-sig-storage-cgroupsv2 1a161d3 link true /test pull-kubevirt-e2e-k8s-1.25-sig-storage-cgroupsv2-0.59
pull-kubevirt-build d8fdb00 link true /test pull-kubevirt-build-0.59
pull-kubevirt-e2e-k8s-1.26-sig-network d8fdb00 link true /test pull-kubevirt-e2e-k8s-1.26-sig-network-0.59
pull-kubevirt-e2e-k8s-1.26-sig-operator-upgrade d8fdb00 link true /test pull-kubevirt-e2e-k8s-1.26-sig-operator-upgrade-0.59
pull-kubevirt-e2e-kind-1.23-sriov d8fdb00 link true /test pull-kubevirt-e2e-kind-1.23-sriov-0.59
pull-kubevirt-build-arm64 d8fdb00 link true /test pull-kubevirt-build-arm64-0.59
pull-kubevirt-e2e-windows2016 d8fdb00 link true /test pull-kubevirt-e2e-windows2016-0.59
pull-kubevirt-unit-test d8fdb00 link true /test pull-kubevirt-unit-test-0.59
pull-kubevirt-e2e-kind-1.23-vgpu d8fdb00 link true /test pull-kubevirt-e2e-kind-1.23-vgpu-0.59
pull-kubevirt-e2e-k8s-1.26-sig-operator-configuration d8fdb00 link true /test pull-kubevirt-e2e-k8s-1.26-sig-operator-configuration-0.59
pull-kubevirt-e2e-k8s-1.25-sig-compute-migrations-root d8fdb00 link true /test pull-kubevirt-e2e-k8s-1.25-sig-compute-migrations-root-0.59
pull-kubevirt-e2e-k8s-1.24-operator-root d8fdb00 link true /test pull-kubevirt-e2e-k8s-1.24-operator-root-0.59
pull-kubevirt-e2e-k8s-1.25-sig-compute d8fdb00 link true /test pull-kubevirt-e2e-k8s-1.25-sig-compute-0.59
pull-kubevirt-e2e-k8s-1.26-sig-compute d8fdb00 link true /test pull-kubevirt-e2e-k8s-1.26-sig-compute-0.59
pull-kubevirt-check-tests-for-flakes d8fdb00 link false /test pull-kubevirt-check-tests-for-flakes-0.59
pull-kubevirt-e2e-k8s-1.24-sig-compute d8fdb00 link true /test pull-kubevirt-e2e-k8s-1.24-sig-compute-0.59
pull-kubevirt-e2e-k8s-1.25-ipv6-sig-network d8fdb00 link true /test pull-kubevirt-e2e-k8s-1.25-ipv6-sig-network-0.59
pull-kubevirt-e2e-k8s-1.25-sig-operator d8fdb00 link true /test pull-kubevirt-e2e-k8s-1.25-sig-operator-0.59
pull-kubevirt-e2e-k8s-1.24-sig-storage d8fdb00 link true /test pull-kubevirt-e2e-k8s-1.24-sig-storage-0.59
pull-kubevirt-e2e-k8s-1.26-sig-monitoring 1a161d3 link true /test pull-kubevirt-e2e-k8s-1.26-sig-monitoring-0.59
pull-kubevirt-e2e-k8s-1.25-sig-network d8fdb00 link true /test pull-kubevirt-e2e-k8s-1.25-sig-network-0.59
pull-kubevirt-e2e-k8s-1.25-sig-compute-migrations d8fdb00 link true /test pull-kubevirt-e2e-k8s-1.25-sig-compute-migrations-0.59
pull-kubevirt-e2e-k8s-1.26-sig-storage d8fdb00 link true /test pull-kubevirt-e2e-k8s-1.26-sig-storage-0.59
pull-kubevirt-e2e-k8s-1.24-sig-storage-root d8fdb00 link true /test pull-kubevirt-e2e-k8s-1.24-sig-storage-root-0.59
pull-kubevirt-e2e-k8s-1.24-sig-compute-root d8fdb00 link true /test pull-kubevirt-e2e-k8s-1.24-sig-compute-root-0.59
pull-kubevirt-e2e-k8s-1.25-sig-storage d8fdb00 link true /test pull-kubevirt-e2e-k8s-1.25-sig-storage-0.59
pull-kubevirt-e2e-k8s-1.24-sig-network-root d8fdb00 link true /test pull-kubevirt-e2e-k8s-1.24-sig-network-root-0.59
pull-kubevirt-e2e-k8s-1.24-operator d8fdb00 link true /test pull-kubevirt-e2e-k8s-1.24-operator-0.59
pull-kubevirt-e2e-k8s-1.24-sig-network d8fdb00 link true /test pull-kubevirt-e2e-k8s-1.24-sig-network-0.59
pull-kubevirt-e2e-k8s-1.25-sig-performance d8fdb00 link true /test pull-kubevirt-e2e-k8s-1.25-sig-performance-0.59

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@kubevirt-bot
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@kubevirt-bot kubevirt-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 13, 2023
@rthallisey
Contributor

/remove-lifecycle stale

@kubevirt-bot kubevirt-bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 13, 2023
@kvaps
Member Author

kvaps commented Jul 26, 2023

I migrated my fork from kvaps/kubevirt to deckhouse/3p-kubevirt; it will now be maintained by the Deckhouse project.
FYI

@kubevirt-bot
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@kubevirt-bot kubevirt-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 24, 2023
@remram44

Still wanted

@rthallisey
Contributor

/remove-lifecycle stale

@kubevirt-bot kubevirt-bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 24, 2023
Member

@xpivarc xpivarc left a comment


Never mind, I see the problem.

@kubevirt-bot
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@kubevirt-bot kubevirt-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 22, 2024
@kubevirt-bot
Contributor

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

@kubevirt-bot kubevirt-bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 21, 2024
@remram44

I am interested in this feature

/remove-lifecycle stale

@kubevirt-bot
Contributor

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

@kubevirt-bot
Contributor

@kubevirt-bot: Closed this PR.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
