Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PR] Prevent thread leakage on exiting #326

Closed
3 of 5 tasks
kopf-archiver bot opened this issue Aug 18, 2020 · 0 comments
Closed
3 of 5 tasks

[PR] Prevent thread leakage on exiting #326

kopf-archiver bot opened this issue Aug 18, 2020 · 0 comments
Labels
archive bug Something isn't working enhancement New feature or request refactoring Code cleanup without new features added

Comments

@kopf-archiver
Copy link

kopf-archiver bot commented Aug 18, 2020

A pull request by nolar at 2020-03-09 18:39:52+00:00
Original URL: zalando-incubator/kopf#326
Merged by nolar at 2020-03-10 14:58:59+00:00

What do these changes do?

Prevent leakage of OS resources (threads) on operator existing, which can make it irresponsive after the main function is finished and no logging is happening (if a sync-handler hangs).

Description

All sync-handlers (declared with def, not async def) are executed in thread pools. The operator itself is fully asynchronous. Whenever operator is exiting or is being cancelled, it also cancels all its coroutines. However, if a coroutine is cancelled at loop.run_in_executor(), the coroutine indeed exits, but the thread it used continues running — until finished. As a result, some threads can continue running even after the operator is kind-of-exited normally.

This becomes critical as long-running handlers are coming into play (i.e. daemons). But it is a change of its own value even now — hence, it is extracted for easier reviewing.

Instead, with this PR, the operator will wait for such threads to finish (potentially forever or until SIGKILL'ed, as Kubernetes does after 30s). And while waiting, it will log the status of the operator's tasks and what exactly it waits for. So, it becomes visible in the logs which handler hangs or prevents operator from exiting properly. In case everything exits instantly, there is nothing to wait, nothing to log, so it exits as usually.

In addition, the lifecycle functions (which select the handlers to be processed) are now called directly to avoid using thread pools. We know in advance that the lifecycle functions are safe and fast (instant), so there is no need to overcomplicate them.

Issues/PRs

Issues: #19

Type of changes

  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Refactoring (non-breaking change which does not alter the behaviour)

Checklist

  • The code addresses only the mentioned problem, and this problem only
  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes
  • If you provide code modification, please add yourself to CONTRIBUTORS.txt
@kopf-archiver kopf-archiver bot closed this as completed Aug 18, 2020
@kopf-archiver kopf-archiver bot changed the title [archival placeholder] [PR] Prevent thread leakage on exiting Aug 19, 2020
@kopf-archiver kopf-archiver bot added bug Something isn't working enhancement New feature or request refactoring Code cleanup without new features added labels Aug 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
archive bug Something isn't working enhancement New feature or request refactoring Code cleanup without new features added
Projects
None yet
Development

No branches or pull requests

0 participants