Breaking change / performance: don't make kubernetes-client deserialize k8s events into objects #424

rmoe · 2020-08-14T15:38:39Z

Turning API responses into Python objects takes a significant
amount of CPU time when dealing with a large number of events.

Fixes: #423

Breaking changes

The Kubernetes EventsReflector, which provides KubeSpawner instances with the Kubernetes Events that describe other resources, now exposes events as Python dictionaries rather than V1Event objects. V1Event is defined in the kubernetes-client/python library as a representation of a Kubernetes Event.
KubeSpawner's .progress method implementation (Progress on spawn-pending page jupyterhub#1771), which generates a formatted message as well as a KubeSpawner-specific raw_event entry, now returns raw_event as a Python dictionary with keys in camelCase where they were previously in snake_case.
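
As a rough illustration of the change described above, the sketch below parses a raw list response with `json.loads` and indexes the items by name, so every resource ends up as a plain dictionary with camelCase keys. The `FakeResponse` class, the `index_by_name` helper, and the sample payload are assumptions made up for this example; they only stand in for the undecoded response body the reflector reads.

```python
import json


class FakeResponse:
    """Hypothetical stand-in for an undecoded HTTP response (not a real client class)."""

    def __init__(self, body: str):
        self._body = body

    def read(self) -> str:
        return self._body


# Made-up payload shaped like a Kubernetes event list (camelCase keys).
payload = json.dumps(
    {
        "items": [
            {
                "metadata": {"name": "ev-1"},
                "lastTimestamp": "2020-08-14T15:00:00Z",
                "message": "Pulling image",
            },
            {
                "metadata": {"name": "ev-2"},
                "lastTimestamp": "2020-08-14T15:00:05Z",
                "message": "Started container",
            },
        ]
    }
)


def index_by_name(response) -> dict:
    """Parse the raw body once and key each item by its metadata name."""
    parsed = json.loads(response.read())
    return {item["metadata"]["name"]: item for item in parsed["items"]}


resources = index_by_name(FakeResponse(payload))

# Fields are now reached with dictionary lookups and camelCase keys.
print(resources["ev-1"]["lastTimestamp"])
```

Code that previously read `event.last_timestamp` would need to switch to `event["lastTimestamp"]` under this scheme.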


welcome · 2020-08-14T15:38:41Z

Thanks for submitting your first pull request! You are awesome! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please make sure you followed the pull request template, as this will help us review your contribution more quickly.

You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉

mriedem

OK mostly some smaller nits and docs related things to update since the values are changing from objects to dicts.

The bigger concern is the impact to out of tree reflectors but I don't have a good grasp on the contract there, or what kind of communication / signalling is appropriate for people, i.e. release notes, forum thread, etc? I think the performance gain is definitely worth the change and (minimal) impact to out of tree reflectors, but it's something that needs to be considered. If we can wrap the dicts in simple objects as a facade maybe that would alleviate some of that concern? It would also actually cut down on the size of this change since you might not need to make any changes to the PodReflector or EventReflector code.

mriedem · 2020-08-14T16:37:31Z

kubespawner/reflector.py

            self.namespace,
            label_selector=self.label_selector,
            field_selector=self.field_selector,
            _request_timeout=self.request_timeout,
        )
        # This is an atomic operation on the dictionary!
-        self.resources = {p.metadata.name: p for p in initial_resources.items}
+        initial_resources = json.loads(initial_resources.read())
+        self.resources = {p["metadata"]["name"]: p for p in initial_resources["items"]}


Technically this changes the value on the resources dict from an object to a dict so the help here should probably be updated as well:

https://github.com/jupyterhub/kubespawner/pull/424/files#diff-2f4d4efedf466e7bf4f8f083d0c722adR52

One thing I'm not sure about is how much this might impact out-of-tree reflectors (assuming those exist, which I don't know if that's supported) that are expecting self.resources to return objects rather than dicts, i.e. this could be a breaking change for them. One could maybe mitigate that by wrapping the dicts in simple objects that pass through __getattr__ to __getitem__ but as we can see below with resource_version we'd also have to camel-case-ify some of the attributes, like this.

I'm not sure what the ABI contract on something like this is within the KubeSpawner. Would a release note be sufficient?
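
The facade idea mentioned above could look roughly like the following sketch. `DictFacade` and `snake_to_camel` are hypothetical names invented for this example, not part of KubeSpawner or the Kubernetes client, and the wrapper is read-only and does not wrap dictionaries nested inside lists.

```python
def snake_to_camel(name: str) -> str:
    """Convert e.g. 'resource_version' to 'resourceVersion'."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


class DictFacade:
    """Hypothetical read-only wrapper exposing dict keys as snake_case attributes."""

    def __init__(self, data: dict):
        self._data = data

    def __getattr__(self, name):
        # __getattr__ only runs when normal attribute lookup fails.
        # Refuse private names so a missing _data cannot cause recursion.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            value = self._data[snake_to_camel(name)]
        except KeyError:
            raise AttributeError(name) from None
        # Wrap nested dicts so chained attribute access keeps working.
        return DictFacade(value) if isinstance(value, dict) else value


pod = DictFacade({"metadata": {"name": "jupyter-a", "resourceVersion": "42"}})
print(pod.metadata.name, pod.metadata.resource_version)
```

A naive conversion like this maps `pod_ip` to `podIp` rather than the API's `podIP`, so some field names would need special handling.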

mriedem · 2020-08-14T16:39:03Z

kubespawner/spawner.py

@@ -106,7 +106,7 @@ def events(self):
        #   timestamp without and the other is a higher resolution timestamp.
        return sorted(
            self.resources.values(),
-            key=lambda event: event.last_timestamp or event.event_time,
+            key=lambda event: event["lastTimestamp"] or event["eventTime"],


This is a good example of the kind of issue I'm talking about above. Handling this for the known event and pod reflectors within this repo is sufficient, but I worry about breaking out of tree reflectors. Like I said, though, I also don't know what contract is there.
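
For an out of tree consumer adapting to dictionaries, the sort in the diff above might be written defensively as in this sketch. The `event_sort_key` helper and the sample events are made up for illustration, and using `.get()` with an empty-string fallback is an assumption of this example rather than what the PR does.

```python
def event_sort_key(event: dict) -> str:
    """Prefer lastTimestamp; fall back to eventTime when it is missing or None."""
    return event.get("lastTimestamp") or event.get("eventTime") or ""


# Made-up events: one has only eventTime set, one lacks the eventTime key entirely.
events = [
    {"metadata": {"name": "b"}, "lastTimestamp": None, "eventTime": "2020-08-14T15:00:02.000000Z"},
    {"metadata": {"name": "a"}, "lastTimestamp": "2020-08-14T15:00:01Z", "eventTime": None},
    {"metadata": {"name": "c"}, "lastTimestamp": "2020-08-14T15:00:03Z"},
]

ordered = sorted(events, key=event_sort_key)
print([e["metadata"]["name"] for e in ordered])
```

Comparing the timestamps as strings is only reliable while the values differ at whole-second granularity or share one format; parsing them into datetimes would be more robust when second-resolution and microsecond-resolution values are mixed.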

mriedem · 2020-08-14T16:40:30Z

kubespawner/spawner.py

-            all([cs.ready for cs in pod.status.container_statuses])
+            pod["status"]["phase"] == 'Running' and
+            pod["status"]["podIP"] is not None and
+            "deletionTimestamp" not in pod["metadata"] and


Hmm, maybe this should be:

not pod["metadata"].get("deletionTimestamp")

Because deletionTimestamp not being in the dict vs being in the dict with a None value are different things, unless that's not possible with how the kube API works?

deletionTimestamp is only set by the server when a deletion is requested. It shouldn't exist at any other time.
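
The difference between the two checks discussed above only shows up when the key is present with a `None` value, as this small sketch with made-up pod dictionaries shows. Both helper names are hypothetical.

```python
def is_deleting_by_key(pod: dict) -> bool:
    """Key-presence check, as in the diff: any deletionTimestamp key counts."""
    return "deletionTimestamp" in pod["metadata"]


def is_deleting_by_value(pod: dict) -> bool:
    """Truthiness check, as suggested in review: a None value counts as not set."""
    return bool(pod["metadata"].get("deletionTimestamp"))


# Made-up pods covering the three cases.
absent = {"metadata": {"name": "p1"}}
null_value = {"metadata": {"name": "p2", "deletionTimestamp": None}}
deleting = {"metadata": {"name": "p3", "deletionTimestamp": "2020-08-14T16:40:00Z"}}

for pod in (absent, null_value, deleting):
    print(pod["metadata"]["name"], is_deleting_by_key(pod), is_deleting_by_value(pod))
```

If the server never sends the key with a null value, as the reply above says, the two checks agree on every pod.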


mriedem · 2020-08-14T16:48:45Z

The bigger concern is the impact to out of tree reflectors

By the way, I'm assuming you can have out of tree reflectors but I'm not entirely sure if you can, especially looking at the hard-coding here. The KubeSpawner docs don't mention anything about entry points to load your own reflectors.

I'm just used to lots of things in jupyterhub being extensions so I made an assumption which is looking false.


mriedem

Looks good to me now assuming there is no support for out of tree reflectors which it's looking like there isn't. My other comments were addressed. We're already running with this in our pre-production environment because of the performance benefit at high load. Nice work!


consideRatio · 2020-09-02T11:40:02Z

Thank you @rmoe for your work on this PR and @mriedem for your review work! ❤️ 🎉!

Notification / Question

@clkao this PR will impact the users of raw_event returned by progress(). I think the difference will be that you get camelCase instead of snake_case, but still get a python dictionary. @rmoe do you think I described this change correctly?

My conclusion

Overall, I think this PR makes a lot of sense given the excellent report in #423. I think we should accept it, verify the breaking changes and report them in a release which needs to be minor rather than patch.

Breaking changes

This is what I pick up to be breaking changes.

Progress() returned raw_event changed from python dictionary formatted with snake_case to a python dictionary formatted with camelCase (I think).
A spawner's events property populated by the Singleton events reflector now returns python dictionaries instead of V1Event objects from kubernetes-client/python.

mriedem · 2020-09-02T13:34:32Z

FWIW I agree.

Progress() returned raw_event changed from python dictionary formatted with snake_case to a python dictionary formatted with camelCase (I think).

raw_event was a dict because the V1Event object was converted to a dict here. I'm not sure what event.to_dict() does regarding the field names, but I'm assuming they were snake-case because that's what they were on the event object, and now they are camel-case.

A spawner's events property populated by the Singleton events reflector now returns python dictionaries instead of V1Event objects from kubernetes-client/python.

Yup, the docstring for the events property function was updated to reflect that here.

I mentioned that we could try to still return an object with some massaging of the field names but that could get complicated. We'd probably know if that worked if we didn't have to make some of the changes in this patch, but it's hard to say for sure since there are no tests (that I could find) for the EventReflector code.


rmoe · 2020-09-02T19:46:09Z

@clkao this PR will impact the users of raw_event returned by progress(). I think the difference will be that you get camelCase instead of snake_case, but still get a python dictionary. @rmoe do you think I described this change correctly?

That's correct. They were snake case because of how those OpenAPI objects got converted to dictionaries. Now we just have the response from the Kubernetes API as a dictionary and everything the API returns is camel cased.
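
Consumers of raw_event that still expect snake_case keys could convert the new dictionaries themselves. The sketch below is one possible approach; `camel_to_snake` and `snake_case_keys` are hypothetical helpers, and the result is not guaranteed to match the old key names in every case.

```python
import re

# Zero-width boundary between a lowercase letter or digit and an uppercase letter.
_CAMEL_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def camel_to_snake(name: str) -> str:
    """Convert e.g. 'lastTimestamp' to 'last_timestamp'."""
    return _CAMEL_BOUNDARY.sub("_", name).lower()


def snake_case_keys(value):
    """Recursively rewrite dictionary keys; other values pass through unchanged."""
    if isinstance(value, dict):
        return {camel_to_snake(k): snake_case_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [snake_case_keys(v) for v in value]
    return value


# Made-up event fragment in the camelCase form returned by the API.
raw_event = {
    "lastTimestamp": "2020-08-14T15:00:00Z",
    "involvedObject": {"resourceVersion": "42"},
}
print(snake_case_keys(raw_event))
```

Note that this rewrites every key it finds, including user-supplied ones such as label or annotation names, so a real migration would probably want to skip those maps.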

consideRatio · 2020-09-02T20:26:44Z

@rmoe thank you soo much for your thorough investigation and this PR to resolve the found performance issue! Thank you @mriedem for your review work! ❤️ 💯 🎉

@jupyterhub/jupyterhubteam this LGTM but it would be nice to get another LGTM before I press merge. It solves a real problem when scaling to a very large number of users, so I think it justifies some breaking changes.

consideRatio

Excellent work! ❤️

welcome · 2020-09-02T20:54:07Z

Congrats on your first merged pull request in this project! 🎉

Thank you for contributing, we are very proud of you! ❤️

yuvipanda · 2020-09-02T20:54:41Z

\o/ Thanks a lot for this, @rmoe and @mriedem!

And thanks for your review, @consideRatio :)

yuvipanda · 2020-09-02T20:55:03Z

I was just dealing with an outage caused by slow response times cascading because of high CPU usage... :)


Don't deserialize Kubernetes API responses

8fb6e5f

Turning API responses into Python objects take a significant amount of CPU time when dealing with a large number of events. Fixes: jupyterhub#423

mriedem suggested changes Aug 14, 2020


Update comments to reflect changed data structures

cc1fa8a

Also fixed issue with how container statuses were being accessed.

betatim requested review from consideRatio and minrk August 15, 2020 06:19

mriedem approved these changes Aug 19, 2020


consideRatio reviewed Sep 2, 2020


consideRatio changed the title ~~Don't deserialize Kubernetes API responses~~ Performance: don't make kubernetes-client deserialize k8s events into objects Sep 2, 2020

Update kubespawner/reflector.py

680a22c

Co-authored-by: Erik Sundell <erik.i.sundell@gmail.com>

consideRatio changed the title ~~Performance: don't make kubernetes-client deserialize k8s events into objects~~ Breaking change / performance: don't make kubernetes-client deserialize k8s events into objects Sep 2, 2020

consideRatio approved these changes Sep 2, 2020


consideRatio added enhancement bug labels Sep 2, 2020

yuvipanda merged commit b209ce1 into jupyterhub:master Sep 2, 2020

yuvipanda added a commit to yuvipanda/datahub-old-fork that referenced this pull request Sep 2, 2020

hub: Build kubespawner from master

c96d933

Brings in jupyterhub/kubespawner#424

yuvipanda added a commit to yuvipanda/datahub-old-fork that referenced this pull request Sep 2, 2020

Bump to master of kubespawner

0094788

Brings in jupyterhub/kubespawner#424

yuvipanda mentioned this pull request Sep 2, 2020

Bring in newer kubespawner berkeley-dsep-infra/datahub#1777

Closed

yuvipanda added a commit to yuvipanda/datahub-old-fork that referenced this pull request Sep 4, 2020

Bump version of z2jh chart to latest

725583e

Primarily to bring in jupyterhub/zero-to-jupyterhub-k8s#1768, which brings in jupyterhub/kubespawner#424 for performance Ref #1746

yuvipanda mentioned this pull request Sep 4, 2020

Bump version of z2jh chart to latest berkeley-dsep-infra/datahub#1793

Merged

consideRatio mentioned this pull request Sep 8, 2020

Fix KubeIngressProxy.get_all_routes for 0.13 #430

Merged

minrk mentioned this pull request Nov 2, 2020

pin kubernetes, jupyterhub in requirements.in jupyterhub/binderhub#1190

Merged

minrk mentioned this pull request Nov 2, 2020

30 Oct - merged and reverted pr - did the merge cause a disruption? jupyterhub/mybinder.org-deploy#1687

Closed

This was referenced Jan 28, 2022

watches still deserialize with _preload_content=False tomplus/kubernetes_asyncio#176

Closed

Rely on the event loop: use kubernetes_asyncio instead of kubernetes and dedicated threads #563

Merged

dolfinus mentioned this pull request May 17, 2023

reflector:_list_and_update check k8s api response error before read the items #722

Open
