Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid emitting warnings/errors during normal operation #578

Closed
consideRatio opened this issue Feb 24, 2022 · 5 comments · Fixed by #579
Closed

Avoid emitting warnings/errors during normal operation #578

consideRatio opened this issue Feb 24, 2022 · 5 comments · Fixed by #579
Labels

Comments

@consideRatio
Copy link
Member

consideRatio commented Feb 24, 2022

Background

@athornton reports there are warnings/errors shown when using KubeSpawner that clutters logs. This issue is meant to help us dig into the details about that. My goal is to resolve it in a way within KubeSpawner that doesn't make us: risk suppressing a relevant warning/error, confuse an end user with irrelevant warnings/errors, maintain a fragile patch making changes to aiohttp.

Technical details

We have switched the library we use to communicate with the k8s api-server to using kubernetes_asyncio from kubernetes. When kubernetes_asyncio makes web requests, it uses aiohttp.

Open questions about the warnings/errors

  • When do we see these errors when using KubeSpawner?
    • Is it only when we shut down a JupyterHub process?
    • Is it when we a user server?
    • Is it at all seemingly random times?
  • How do we see these errors when using KubeSpawner?
    • Is it via logs from the JupyterHub process?
    • How exactly does the logs look as reported from there?

Open technical questions

  • What options do we have?
    1. Patching aiohttp
    2. ???

Related

@minrk
Copy link
Member

minrk commented Feb 24, 2022

I dug into this a little bit, and it's because we allow our shared clients to be garbage collected by using weakrefs. If we held onto our shared_client objects and cleaned them up atexit instead, the warning appears to go away.

that raises errors (unsure about this).

It doesn't raise errors (for us or by default). The loop's exception handler is the (overrideable) callback for unhandled errors in asyncio that don't have a coroutine/future to attach errors to. It can be used to halt the loop if you want all errors to be fatal for some reason, but the default and most common behavior is to log and move on, which I think is why it's used here.

The feature that's really missing is an "atexit, but for asyncio" which is called when the event loop is closed. This doesn't exist, and there's no reliable way to register things (especially coroutines) to call at shutdown of asyncio while the eventloop still exists.

@consideRatio
Copy link
Member Author

consideRatio commented Feb 24, 2022

I dug into this a little bit, and it's because we allow our shared clients to be garbage collected by using weakrefs. If we held onto our shared_client objects and cleaned them up atexit instead, the warning appears to go away.

Wieeeee I'm excited that you have a quite clear idea about the situation, I don't yet get it though!

  1. clients.py defines _client_cache = {} where references are stored via weakref.ref(client) where the client instance was created in a function.
  2. KubeSpawner instances are created over time that make use of shared clients...
  3. KubeSpawner instances are garbage collected over time (I assume!)

Hmm...

  • JupyterHub terminates at some point
  • atexit registered functions are called at some point
  • the ClientSession aiohttp instances part of the shared clients gets called at some point, at which we get warnings/errors as we didn't terminate our clients properly first (??)

Hmmmmmm...

  • At this point, I don't know the purpose of having a weakref wrap of the shared_client. Learning more about weakref again now.

@manics
Copy link
Member

manics commented Feb 24, 2022

I share @consideRatio's concerns about patching aiohttp, especially since it affects other JupyterHub components or extensions that might use aiohttp. Debugging async problems is already difficult enough!

Is it possible to come up with a test case we can add to https://github.com/jupyterhub/kubespawner/tree/main/tests so it's easy to reproduce?

@minrk
Copy link
Member

minrk commented Feb 24, 2022

#579 includes tests, but it patches asyncio instead of aiohttp! It patches it to do something good instead of skipping cleanup, though.

@athornton
Copy link
Contributor

#579 seems like a good thing. I'll withdraw the monkeypatch from the z2jh PR (jupyterhub/zero-to-jupyterhub-k8s#2594) and adopt #579 with a release that incorporates it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants