[candi] CSI old mount cleaner #5153

Merged
merged 6 commits into from
Jul 12, 2023

zuzzas
Contributor

@zuzzas zuzzas commented Jul 7, 2023

Description

Cleans up old CSI mounts that follow the pre-Kubernetes 1.24 naming convention. The cleaner is deployed only on Kubernetes >=1.24, and only if such mounts are present on the node.
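
The deployment condition above can be sketched roughly as follows. This is an illustration only, not the logic of the actual template: the function names are hypothetical, and the old path prefix `/var/lib/kubelet/plugins/kubernetes.io/csi/pv/` is an assumption about the pre-1.24 layout, as is reading mount points from `/proc/mounts`-style text.

```python
# Assumed pre-1.24 CSI global mount prefix (not taken from this PR).
OLD_PREFIX = "/var/lib/kubelet/plugins/kubernetes.io/csi/pv/"


def parse_minor(version):
    """Turn '1.24', 'v1.24.3' and similar strings into a (major, minor) tuple."""
    major, minor = version.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def old_csi_mount_points(mounts_text):
    """Return mount points under the assumed old prefix.

    mounts_text is expected in /proc/mounts format, where the second
    whitespace-separated field of each line is the mount point.
    """
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1].startswith(OLD_PREFIX):
            points.append(fields[1])
    return points


def should_deploy(version, mounts_text):
    """Deploy only on Kubernetes >= 1.24 and only if old-style mounts exist."""
    return parse_minor(version) >= (1, 24) and bool(old_csi_mount_points(mounts_text))
```

On a real node the second argument could be the contents of `/proc/mounts`; here it is passed in as a string so the check stays easy to test.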

Closes #5132

Why do we need it, and what problem does it solve?

It enables successful Pod volume unmounts when upgrading Kubernetes on the fly (without draining a Node).

Closes upstream issue: kubernetes/kubernetes#107065

What is the expected result?

When kubelet reports that it cannot delete a Pod because a mount still exists (at a path that follows the old format), parse the log statement and unmount the path manually.
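
A minimal sketch of that flow, under stated assumptions: the exact kubelet log wording is not quoted in this PR, so the code only looks for paths shaped like `/var/lib/kubelet/plugins/kubernetes.io/csi/pv/<pv-name>/globalmount` (an assumed pre-1.24 layout) anywhere in a log line. The function names are hypothetical, and the real cleaner is a shell template, not Python.

```python
import re
import subprocess

# Assumed pre-1.24 layout: .../plugins/kubernetes.io/csi/pv/<pv-name>/globalmount
OLD_CSI_MOUNT_RE = re.compile(
    r"(/var/lib/kubelet/plugins/kubernetes\.io/csi/pv/[^/\s\"']+/globalmount)"
)


def find_old_csi_mounts(log_lines):
    """Collect unique old-format mount paths from log lines, in first-seen order."""
    seen = []
    for line in log_lines:
        for path in OLD_CSI_MOUNT_RE.findall(line):
            if path not in seen:
                seen.append(path)
    return seen


def unmount_all(paths, dry_run=True):
    """Build one `umount <path>` command per path; run them unless dry_run is set."""
    commands = [["umount", path] for path in paths]
    if not dry_run:
        for cmd in commands:
            # check=False: a path that is already unmounted should not abort the loop.
            subprocess.run(cmd, check=False)
    return commands
```

Keeping `dry_run=True` as the default makes it possible to inspect the commands before anything is unmounted.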

Checklist

  • The code is covered by unit tests.
  • e2e tests passed.
  • Documentation updated according to the changes.
  • Changes were tested in the Kubernetes cluster manually.

Changelog entries

section: candi
type: fix
summary: Create a `systemd` unit that manually unmounts `pre-1.24` CSI mounts. This should prevent Pods from getting stuck while upgrading `kubelet` without draining a node.

Signed-off-by: Andrey Klimentyev <andrey.klimentyev@flant.com>
@zuzzas zuzzas added this to the v1.47.4 milestone Jul 7, 2023
@zuzzas zuzzas self-assigned this Jul 7, 2023
@zuzzas zuzzas added type/bug area/cluster-and-infrastructure Pull requests that update infra modules labels Jul 7, 2023
+
Signed-off-by: Andrey Klimentyev <andrey.klimentyev@flant.com>
@zuzzas zuzzas added the e2e/run/yandex-cloud Run e2e tests in Yandex Cloud label Jul 10, 2023
@deckhouse-BOaTswain
Collaborator

deckhouse-BOaTswain commented Jul 10, 2023

🟢 e2e: Yandex.Cloud for deckhouse:mount-faceslapper succeeded in 36m13s.

Workflow details

Yandex.Cloud-WithoutNAT-Containerd-1.23 - Connection string: ssh cloud-user@158.160.103.129

🟢 e2e: Yandex.Cloud, Containerd, Kubernetes 1.23 succeeded in 35m37s.

@github-actions github-actions bot removed the e2e/run/yandex-cloud Run e2e tests in Yandex Cloud label Jul 10, 2023
@zuzzas zuzzas added the e2e/run/yandex-cloud Run e2e tests in Yandex Cloud label Jul 10, 2023
@deckhouse-BOaTswain
Collaborator

deckhouse-BOaTswain commented Jul 10, 2023

🔴 e2e: Yandex.Cloud for deckhouse:mount-faceslapper failed in 40m2s.

E2e for yandex-cloud WithoutNAT;containerd;1.23 failed. Use:
ssh -i ~/.ssh/e2e-id-rsa cloud-user@84.201.156.51 - connect for debugging;

/e2e/abort yandex-cloud;WithoutNAT;containerd;1.23 5505336462 5505336462-1-con-1-23 /sys/deckhouse-oss/install:pr5153 cloud-user@84.201.156.51 - for abort failed cluster

Workflow details (1 job failed)

Yandex.Cloud-WithoutNAT-Containerd-1.23 - Connection string: ssh cloud-user@84.201.156.51

🔴 e2e: Yandex.Cloud, Containerd, Kubernetes 1.23 failed in 39m25s.

@github-actions github-actions bot removed the e2e/run/yandex-cloud Run e2e tests in Yandex Cloud label Jul 10, 2023
@deckhouse-BOaTswain deckhouse-BOaTswain added the e2e/cluster/failed Pull request contains failed e2e cluster label Jul 10, 2023
@zuzzas
Contributor Author

zuzzas commented Jul 10, 2023

/e2e/abort yandex-cloud;WithoutNAT;containerd;1.23 5505336462 5505336462-1-con-1-23 /sys/deckhouse-oss/install:pr5153 cloud-user@84.201.156.51

@name212
Member

name212 commented Jul 10, 2023

/e2e/abort yandex-cloud;WithoutNAT;containerd;1.23 5505336462 5505336462-1-con-1-23 /sys/deckhouse-oss/install:pr5153 cloud-user@84.201.156.5

@deckhouse-BOaTswain
Collaborator

deckhouse-BOaTswain commented Jul 10, 2023

🔴 e2e destroy failed: Yandex.Cloud for refs/heads/main failed in 6m59s.

Workflow details (1 job failed)

🔴 e2e destroy failed: Yandex.Cloud, Containerd, Kubernetes 1.23 failed in 6m6s.

@zuzzas zuzzas added the e2e/run/yandex-cloud Run e2e tests in Yandex Cloud label Jul 10, 2023
@deckhouse-BOaTswain
Collaborator

deckhouse-BOaTswain commented Jul 10, 2023

🟢 e2e: Yandex.Cloud for deckhouse:mount-faceslapper succeeded in 34m49s.

Workflow details

Yandex.Cloud-WithoutNAT-Containerd-1.23 - Connection string: ssh cloud-user@158.160.54.165

🟢 e2e: Yandex.Cloud, Containerd, Kubernetes 1.23 succeeded in 34m14s.

@github-actions github-actions bot removed the e2e/run/yandex-cloud Run e2e tests in Yandex Cloud label Jul 10, 2023
@zuzzas zuzzas marked this pull request as ready for review July 10, 2023 14:01
@zuzzas zuzzas requested a review from RomanenkoDenys as a code owner July 10, 2023 14:01
@zuzzas zuzzas added e2e/run/yandex-cloud Run e2e tests in Yandex Cloud e2e/use/k8s/1.25 Use Kubernetes 1.25 for e2e tests and removed e2e/cluster/failed Pull request contains failed e2e cluster labels Jul 11, 2023
@deckhouse-BOaTswain
Collaborator

deckhouse-BOaTswain commented Jul 11, 2023

🟢 e2e: Yandex.Cloud for deckhouse:mount-faceslapper succeeded in 33m53s.

Workflow details

Yandex.Cloud-WithoutNAT-Containerd-1.25 - Connection string: ssh cloud-user@158.160.108.131

🟢 e2e: Yandex.Cloud, Containerd, Kubernetes 1.25 succeeded in 33m18s.

@github-actions github-actions bot removed the e2e/run/yandex-cloud Run e2e tests in Yandex Cloud label Jul 11, 2023
@z9r5
Member

z9r5 commented Jul 12, 2023

/e2e/abort yandex-cloud;WithoutNAT;containerd;1.23 5505336462 5505336462-1-con-1-23 /sys/deckhouse-oss/install:pr5153 cloud-user@84.201.156.51

@deckhouse-BOaTswain
Collaborator

deckhouse-BOaTswain commented Jul 12, 2023

🔴 e2e destroy failed: Yandex.Cloud for refs/heads/main failed in 4m59s.

Workflow details (1 job failed)

🔴 e2e destroy failed: Yandex.Cloud, Containerd, Kubernetes 1.23 failed in 4m37s.

+
Signed-off-by: Andrey Klimentyev <andrey.klimentyev@flant.com>
…d_csi_mounts_after_kubernetes_1_24.sh.tpl

Co-authored-by: Denys Romanenko <65756796+RomanenkoDenys@users.noreply.github.com>
Signed-off-by: Andrey Klimentyev <andrey.klimentyev@flant.com>
@zuzzas zuzzas added the e2e/run/yandex-cloud Run e2e tests in Yandex Cloud label Jul 12, 2023
@deckhouse-BOaTswain
Collaborator

deckhouse-BOaTswain commented Jul 12, 2023

🔴 e2e: Yandex.Cloud for deckhouse:mount-faceslapper failed in 40m36s.

E2e for yandex-cloud WithoutNAT;containerd;1.25 failed. Use:
ssh -i ~/.ssh/e2e-id-rsa cloud-user@84.201.135.116 - connect for debugging;

/e2e/abort yandex-cloud;WithoutNAT;containerd;1.25 5529685365 5529685365-1-con-1-25 /sys/deckhouse-oss/install:pr5153 cloud-user@84.201.135.116 - for abort failed cluster

Workflow details (1 job failed)

Yandex.Cloud-WithoutNAT-Containerd-1.25 - Connection string: ssh cloud-user@84.201.135.116

🔴 e2e: Yandex.Cloud, Containerd, Kubernetes 1.25 failed in 40m3s.

@github-actions github-actions bot removed the e2e/run/yandex-cloud Run e2e tests in Yandex Cloud label Jul 12, 2023
@RomanenkoDenys RomanenkoDenys self-requested a review July 12, 2023 09:08
+
Signed-off-by: Andrey Klimentyev <andrey.klimentyev@flant.com>
@deckhouse-BOaTswain deckhouse-BOaTswain added the e2e/cluster/failed Pull request contains failed e2e cluster label Jul 12, 2023
@z9r5
Member

z9r5 commented Jul 12, 2023

/e2e/abort yandex-cloud;WithoutNAT;containerd;1.25 5529685365 5529685365-1-con-1-25 /sys/deckhouse-oss/install:pr5153 cloud-user@84.201.135.116

@z9r5 z9r5 added the e2e/run/yandex-cloud Run e2e tests in Yandex Cloud label Jul 12, 2023
@deckhouse-BOaTswain
Collaborator

deckhouse-BOaTswain commented Jul 12, 2023

🟢 e2e destroy failed: Yandex.Cloud for refs/heads/main succeeded in 9m19s.

Workflow details

🟢 e2e destroy failed: Yandex.Cloud, Containerd, Kubernetes 1.25 succeeded in 3m51s.

@deckhouse-BOaTswain
Collaborator

deckhouse-BOaTswain commented Jul 12, 2023

🟢 e2e: Yandex.Cloud for deckhouse:mount-faceslapper succeeded in 35m38s.

Workflow details

Yandex.Cloud-WithoutNAT-Containerd-1.25 - Connection string: ssh cloud-user@51.250.74.138

🟢 e2e: Yandex.Cloud, Containerd, Kubernetes 1.25 succeeded in 32m42s.

@github-actions github-actions bot removed the e2e/run/yandex-cloud Run e2e tests in Yandex Cloud label Jul 12, 2023
@deckhouse-BOaTswain deckhouse-BOaTswain removed the e2e/cluster/failed Pull request contains failed e2e cluster label Jul 12, 2023
+
Signed-off-by: Andrey Klimentyev <andrey.klimentyev@flant.com>
Member

@RomanenkoDenys RomanenkoDenys left a comment

LGTM

Member

@RomanenkoDenys RomanenkoDenys left a comment

LGTM

@RomanenkoDenys RomanenkoDenys merged commit cb7b3eb into main Jul 12, 2023
@RomanenkoDenys RomanenkoDenys deleted the mount-faceslapper branch July 12, 2023 11:47
@z9r5 z9r5 added the status/backport Backport pr label Jul 12, 2023
github-actions bot pushed a commit that referenced this pull request Jul 12, 2023
Signed-off-by: Andrey Klimentyev <andrey.klimentyev@flant.com>
Co-authored-by: Denys Romanenko <65756796+RomanenkoDenys@users.noreply.github.com>
@deckhouse-BOaTswain
Collaborator

Cherry pick PR 5180 to the branch release-1.47 successful!

deckhouse-BOaTswain added a commit that referenced this pull request Jul 12, 2023
Signed-off-by: Andrey Klimentyev <andrey.klimentyev@flant.com>
Co-authored-by: Andrey Klimentyev <andrey.klimentyev@flant.com>
Co-authored-by: Denys Romanenko <65756796+RomanenkoDenys@users.noreply.github.com>
@deckhouse-BOaTswain deckhouse-BOaTswain removed the status/backport Backport pr label Jul 12, 2023
@zuzzas zuzzas mentioned this pull request Aug 15, 2023
4 tasks
Labels
area/cluster-and-infrastructure Pull requests that update infra modules e2e/use/k8s/1.25 Use Kubernetes 1.25 for e2e tests type/bug
Development

Successfully merging this pull request may close these issues.

PVs provisioned by all CSI drivers may have problems when upgrading to Kubernetes 1.24
5 participants