Branched discussion from kubernetes_asyncio switch - is ThreadPool workaround needed? #567

consideRatio · 2022-02-17T16:38:46Z

TL;DR

I considered if a workaround is still needed, also after kubernetes_asyncio is used to replace kubernetes in #563, and concluded yes it probably is - but it isn't if a property is never read - but that's hard to be confident about.

Background

I'm reviewing #563 that replaces kubernetes with kubernetes_asyncio. Doing that, I'm considering if we should remove a workaround or not.

Regarding this workaround

This is a workaround introduced in #169.

kubespawner/kubespawner/clients.py

Lines 12 to 19 in a669816

    
           # FIXME: remove when instantiating a kubernetes client 
        
           # doesn't create N-CPUs threads unconditionally. 
        
           # monkeypatch threadpool in kubernetes api_client 
        
           # to avoid instantiating ThreadPools. 
        
           # This is known to work for kubernetes-4.0 
        
           # and may need updating with later kubernetes clients 
        
           _dummy_pool = Mock() 
        
           api_client.ThreadPool = lambda *args, **kwargs: _dummy_pool

Do we still need it?

A ThreadPool object was in the past initialized by the kubernetes client without a specified number of threads to be used, which led to a number of threads matching the number of CPUs. Since kubernetes>=9 they initialize the ThreadPool object with 1 thread by default.

Does this make us no longer need the patch, or is it problematic that a ThreadPool object is instantiated at all? I think so, but does this change with switching to kubernetes_asyncio?

How does the hack work?

kubespawner/kubespawner/clients.py

Lines 18 to 19 in a669816

    
           _dummy_pool = Mock() 
        
           api_client.ThreadPool = lambda *args, **kwargs: _dummy_pool

This modifies the following code:

    @property
    def pool(self):
        """Create thread pool on first request
         avoids instantiating unused threadpool for blocking clients.
        """
        if self._pool is None:
            atexit.register(self.close)
            self._pool = ThreadPool(self.pool_threads)
        return self._pool

From this, I presume that when self.pool is referenced, a Mock() instance is returned, which just happily accepts everything but does nothing.

Will `kubernetes_asyncio` use the `pool`?

Depends, is async_req passed when ApiClient instances calls call_api?

    def call_api(self, resource_path, ...
        if not async_req:
            return self.__call_api(resource_path, ...
        return self.pool.apply_async(self.__call_api, (resource_path, ...

Initial conclusion

I think neither kubernetes or kubernetes_asyncion makes api requests by passing async_req parameters. But I note there is a difference between a __call_api function.

kubernetes defines __call_api without async.
kubernetes_asyncio defines __call_api with async.

Overall, I conclude we shouldn't have a threadpool, and Mocking it to never initialize a ThreadPool with one thread is fine - because it will never be referenced no matter what.

At the same time, I wonder if a ThreadPool would ever be initialized? What if the pool property isn't accessed at any time? If it isn't, then it won't try create a ThreadPool at all, and it doesn't look to be read in api_client.py for example - if it were with our mock - I think it would error anyhow.

The text was updated successfully, but these errors were encountered:

athornton · 2022-02-17T17:00:03Z

I came to basically the same conclusion. I don't think we need it anymore, but it's super-lightweight and it didn't feel like there was any harm in leaving it around. We should not need a threadpool since we're no longer using Threading at all.

consideRatio added the maintenance label Feb 17, 2022

consideRatio closed this as completed Feb 17, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Branched discussion from kubernetes_asyncio switch - is ThreadPool workaround needed? #567

Branched discussion from kubernetes_asyncio switch - is ThreadPool workaround needed? #567

consideRatio commented Feb 17, 2022 •

edited

Loading

athornton commented Feb 17, 2022

Branched discussion from kubernetes_asyncio switch - is ThreadPool workaround needed? #567

Branched discussion from kubernetes_asyncio switch - is ThreadPool workaround needed? #567

Comments

consideRatio commented Feb 17, 2022 • edited Loading

TL;DR

Background

Regarding this workaround

Do we still need it?

How does the hack work?

Will kubernetes_asyncio use the pool?

Initial conclusion

athornton commented Feb 17, 2022

consideRatio commented Feb 17, 2022 •

edited

Loading

Will `kubernetes_asyncio` use the `pool`?