Failed tests leak CSI resources #260
Comments
/kind bug
@pohly you mentioned on Slack that there is some kind of cleanup taking place already. Looking forward to hearing about it.
That code is in https://github.com/kubernetes-csi/csi-test/blob/master/pkg/sanity/cleanup.go Whenever you see a call to
Have you encountered a specific test where the cleanup didn't work?
@pohly I had experienced the problem when I was running What am I missing? Is this possibly a mock driver vs. hostpath driver issue?
Yes. Looks like those tests were written without taking into account how cleanup is supposed to work. So this issue is valid, we just need to be more specific about which tests have to be updated.
@pohly at a quick glance, it looks like all Other tests may be affected as well. I spotted the AFAICS, there are two TODOs:
Going further, I wonder if we should extend the (test) API so that test authors are less susceptible to missing resource registrations. For instance, we could provide a convenience function that creates a resource and registers it for cleanup in one go. This may also allow us to get rid of Feel free to assign me to this one. I should have some bandwidth by the end of this week to work on whatever we determine is the most desirable approach, and update all volume/snapshot usages in the test code systematically.
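To make the "create and register in one go" idea concrete, here is a minimal sketch. The names `Cleanup`, `Register`, `DeleteAll`, and `CreateAndRegister` are hypothetical and chosen for illustration only; they are not the existing csi-test API, and the in-memory map merely stands in for a driver's volumes.

```go
package main

import (
	"errors"
	"fmt"
)

// Cleanup tracks delete callbacks for resources created during a test.
// Hypothetical type for illustration; not the csi-test API.
type Cleanup struct {
	deleters []func() error
}

// Register adds a delete callback to be run later by DeleteAll.
func (c *Cleanup) Register(del func() error) {
	c.deleters = append(c.deleters, del)
}

// DeleteAll runs all callbacks in reverse creation order and joins any errors.
func (c *Cleanup) DeleteAll() error {
	var errs []error
	for i := len(c.deleters) - 1; i >= 0; i-- {
		if err := c.deleters[i](); err != nil {
			errs = append(errs, err)
		}
	}
	c.deleters = nil
	return errors.Join(errs...)
}

// CreateAndRegister creates a resource and, only if creation succeeded,
// registers its deletion so the test author cannot forget to do so.
func CreateAndRegister[T any](c *Cleanup, create func() (T, error), del func(T) error) (T, error) {
	res, err := create()
	if err != nil {
		return res, err
	}
	c.Register(func() error { return del(res) })
	return res, nil
}

func main() {
	store := map[string]bool{} // stands in for the driver's volumes
	c := &Cleanup{}
	defer func() {
		if err := c.DeleteAll(); err != nil {
			fmt.Println("cleanup error:", err)
		}
		fmt.Println("volumes left:", len(store))
	}()

	vol, err := CreateAndRegister(c,
		func() (string, error) { store["vol-1"] = true; return "vol-1", nil },
		func(id string) error { delete(store, id); return nil },
	)
	fmt.Println("created:", vol, "err:", err)
}
```

With a helper of this shape, a test would never call the create function directly, so there is no separate registration step to miss.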
That sounds useful.
Thanks! I'm not sure whether I can assign it to you, but we'll see: /assign @timoreimann
Running into a similar issue: I have a CSI plugin that's far from complete/spec-compliant, so many tests like Now, when such a test fails, it appears
@NicolasT: if the default implementation for directory creation/deletion doesn't work for you, then you can also use scripts. For example, the sanity command runs unprivileged and thus cannot clean up after a privileged CSI driver mounted something inside the directory. But a custom script could use "sudo" to clean up. Also, in PMEM-CSI, we are using a create script for the directories which uses mktemp to create unique directories for each test: https://github.com/intel/pmem-csi/blob/dfdd0a0fb3a3e957ad91ebdd5d93f6d64caea77f/test/e2e/storage/sanity.go#L171-L177 That still leaves garbage behind after one test failure, but at least the others can still run.
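For illustration, the unique-directory idea can also be sketched in plain Go instead of a shell script. This is not the PMEM-CSI code linked above; `newTestDir`, `removeTestDir`, and the `csi-sanity-*` name pattern are hypothetical, and `os.MkdirTemp` plays the role of `mktemp -d`.

```go
package main

import (
	"fmt"
	"os"
)

// newTestDir creates a uniquely named directory under base, similar to
// `mktemp -d`. An empty base means the system's default temp directory.
// Hypothetical helper for illustration only.
func newTestDir(base string) (string, error) {
	return os.MkdirTemp(base, "csi-sanity-*")
}

// removeTestDir deletes the directory and everything below it. It fails if
// a privileged driver left something behind that this process cannot remove.
func removeTestDir(dir string) error {
	return os.RemoveAll(dir)
}

func main() {
	staging, err := newTestDir("")
	if err != nil {
		fmt.Println("create failed:", err)
		return
	}
	defer removeTestDir(staging)

	target, err := newTestDir("")
	if err != nil {
		fmt.Println("create failed:", err)
		return
	}
	defer removeTestDir(target)

	// Each call yields a different path, so a directory left behind by a
	// failed test does not collide with the next test's directory.
	fmt.Println("distinct directories:", staging != target)
}
```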
/remove-lifecycle stale
The vast majority (all?) of the tests in csi-test do not seem to clean up resources properly when an assertion failure occurs. I'm not too familiar with Gomega, but my understanding is that once an Expect fails, the test execution comes to a halt immediately, so none of the cleanup logic usually done at the end of a test (e.g., deleting volumes or snapshots) gets a chance to execute.
This is a problem because CSI resources leaked this way may cause other tests to fail even though they otherwise wouldn't. Specifically, a test that does not create any fixtures of its own would now see left-over objects from the failed tests and thus fail too, which makes it harder to identify why a test run fails.
It'd be good to come up with a means to reliably clean up resources regardless of what the outcome of a test (and any of its assertions) is.
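As a rough illustration of the problem and one way around it, the sketch below uses plain Go rather than Ginkgo/Gomega. It assumes that a failed assertion aborts the test body by panicking; `runTest` and the in-memory `volumes` map are hypothetical stand-ins, not csi-test code. The point is only that cleanup placed in a `defer` runs even when the body is cut short.

```go
package main

import "fmt"

// runTest runs body and always runs cleanup afterwards, even if body
// panics. It reports whether the body failed. Hypothetical helper.
func runTest(body func(), cleanup func()) (failed bool) {
	// Deferred calls run in reverse order: the recover below runs first,
	// then cleanup. Both run on normal return and on panic.
	defer cleanup()
	defer func() {
		if r := recover(); r != nil {
			failed = true
		}
	}()
	body()
	return false
}

func main() {
	volumes := map[string]bool{} // stands in for volumes held by the driver

	failed := runTest(
		func() {
			volumes["vol-1"] = true
			panic("assertion failed") // stands in for a failed assertion
			// Cleanup written here, at the end of the body, would never run.
		},
		func() { delete(volumes, "vol-1") },
	)

	fmt.Println("failed:", failed, "leaked volumes:", len(volumes))
	// prints "failed: true leaked volumes: 0"
}
```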