Expose connection_pool_maxsize on Index and add docstrings (#415)
## Problem

To explore the impact on performance, I want to expose a configuration kwarg for `connection_pool_maxsize` on `Index`.

## Solution

This `connection_pool_maxsize` value is passed to `urllib3.PoolManager` as `maxsize`. This parameter controls how many connections are cached for a given host. If we use a large number of threads to increase parallelism while `maxsize` is relatively small, we incur unnecessary overhead establishing and discarding the connections that exceed the cached limit.

By default, `connection_pool_maxsize` is set to `multiprocessing.cpu_count() * 5`. In Google Colab the CPU count is only 2, so this default is fairly limiting.

### Usage

```python
from pinecone import Pinecone

pc = Pinecone(api_key='key')
index = pc.Index(
    host="jen1024-dojoi3u.svc.apw5-4e34-81fa.pinecone.io",
    pool_threads=25,
    connection_pool_maxsize=25
)
```

## Type of Change

- [x] New feature (non-breaking change which adds functionality)

## Test Plan

I ran some local performance tests and saw that this does have an impact on performance.
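
### Sizing sketch

To make the default concrete, here is a minimal sketch of the arithmetic described in the Solution section. Both helper names (`default_connection_pool_maxsize`, `pool_is_undersized`) and the `cpu_count` override are hypothetical, introduced only for illustration; they are not part of the `pinecone` client, and the thread-versus-pool comparison is an assumed rule of thumb rather than a documented one.

```python
import multiprocessing
from typing import Optional


def default_connection_pool_maxsize(cpu_count: Optional[int] = None) -> int:
    """Return the default described above: CPU count times 5.

    `cpu_count` is a hypothetical override so the arithmetic can be
    checked for a specific machine size (for example, 2 CPUs).
    """
    if cpu_count is None:
        cpu_count = multiprocessing.cpu_count()
    return cpu_count * 5


def pool_is_undersized(
    pool_threads: int,
    connection_pool_maxsize: Optional[int] = None,
    cpu_count: Optional[int] = None,
) -> bool:
    """Assumed rule of thumb: the pool is undersized when there are
    more worker threads than cached connections per host."""
    if connection_pool_maxsize is None:
        connection_pool_maxsize = default_connection_pool_maxsize(cpu_count)
    return pool_threads > connection_pool_maxsize


# On a 2-CPU machine the default is 10, so 25 threads exceed it.
print(default_connection_pool_maxsize(cpu_count=2))
print(pool_is_undersized(pool_threads=25, cpu_count=2))
# Setting connection_pool_maxsize=25 explicitly matches the thread count.
print(pool_is_undersized(pool_threads=25, connection_pool_maxsize=25))
```

Under these assumptions, matching `connection_pool_maxsize` to `pool_threads`, as in the usage example, keeps each thread's connection eligible for reuse instead of being discarded.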
Showing 2 changed files with 53 additions and 0 deletions.