Support caching of API call response based on set time expiry as part of boto3.session.Session #2723

Closed
rams3sh opened this issue Jan 13, 2021 · 4 comments
Labels
feature-request This issue requests a feature.

Comments

rams3sh commented Jan 13, 2021

Is your feature request related to a problem? Please describe.

I have a scheduled, trigger-based serverless app that runs on Lambda functions in a central account. It has to assume a role in roughly 1,000 accounts and carry out certain actions in each. Each task within the app runs as a separate process and may need to repeat some boto3 API calls within the same account. As a result, the same API calls are made multiple times, which causes I/O wait and throttling (with a sleep before each retry) and increases execution time, sometimes enough to cause Lambda timeouts. Writing a function cache wrapper for each boto3 method did not feel like a scalable solution to me.

This feature request proposes a setting where one could specify a cache directory and a cache expiry timeout. The boto3 package can use that path to store all cached responses (for example, as pickled files). Whenever an API call is made through a session object, the cache is checked first, and the cached response is returned if it was stored within the expiry window.

The scope of the cache would be limited to that session.
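To make the proposal concrete, here is a minimal sketch of the lookup described above, written with the standard library only. FileResponseCache, its call method, and fake_list_users are hypothetical names used for illustration; none of this is boto3 API. It assumes the cache directory is trusted (unpickling untrusted files is unsafe) and does not handle concurrent writers.

```python
import hashlib
import json
import pickle
import tempfile
import time
from pathlib import Path


class FileResponseCache:
    """Hypothetical on-disk response cache; not part of boto3."""

    def __init__(self, cache_path, expiry_minutes):
        self.cache_dir = Path(cache_path)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.expiry_seconds = expiry_minutes * 60

    def _file_for(self, service, operation, params):
        # Key on service, operation, and parameters so different calls never collide.
        raw = json.dumps([service, operation, params], sort_keys=True, default=str)
        return self.cache_dir / (hashlib.sha256(raw.encode()).hexdigest() + ".pkl")

    def call(self, service, operation, params, fetch):
        """Return a cached response that has not expired, else call fetch() and store it."""
        path = self._file_for(service, operation, params)
        if path.exists() and time.time() - path.stat().st_mtime < self.expiry_seconds:
            with path.open("rb") as handle:
                return pickle.load(handle)
        response = fetch()
        with path.open("wb") as handle:
            pickle.dump(response, handle)
        return response


calls = []


def fake_list_users():
    # Stand-in for iam_client.list_users(); records each simulated API call.
    calls.append("ListUsers")
    return {"Users": [{"UserName": "alice"}]}


with tempfile.TemporaryDirectory() as cache_dir:
    cache = FileResponseCache(cache_dir, expiry_minutes=15)
    first = cache.call("iam", "ListUsers", {}, fake_list_users)
    second = cache.call("iam", "ListUsers", {}, fake_list_users)
    # A second cache object pointing at the same directory reuses the stored file.
    other = FileResponseCache(cache_dir, expiry_minutes=15)
    third = other.call("iam", "ListUsers", {}, fake_list_users)
```

In this sketch, freshness is judged by the cache file's modification time, so separate processes that share the directory also share the expiry window.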

Describe the solution you'd like
A sample code snippet to give an idea of the expected solution is given below:

import time

import boto3

# cache_path and cached_response_expiry are the proposed parameters, not existing boto3 arguments.
cache_enabled_session = boto3.session.Session(cache_path="/.cache/", cached_response_expiry=15)  # 15 minutes
iam_client = cache_enabled_session.client("iam")
users_from_api_call = iam_client.list_users()
users_from_cached_response = iam_client.list_users()

s3_client = cache_enabled_session.client("s3", region_name="eu-west-1")
buckets_from_api_call = s3_client.list_buckets()
buckets_from_cached_response = s3_client.list_buckets()

# After 13 minutes, if another session is created with the same parameters and the same cache path, it should reuse the cache written by cache_enabled_session.
time.sleep(13 * 60)
cache_enabled_session_2 = boto3.session.Session(cache_path="/.cache/", cached_response_expiry=15)
iam_client = cache_enabled_session_2.client("iam")
users_from_cached_response_2 = iam_client.list_users()  # Response is fetched from the same cache

# The session below has no access to that cache and does not use cached responses.
cache_disabled_session = boto3.session.Session()
uncached_iam_client = cache_disabled_session.client("iam")
users_from_api_call = uncached_iam_client.list_users()
@rams3sh rams3sh added feature-request This issue requests a feature. needs-triage This issue or PR still needs to be triaged. labels Jan 13, 2021
@swetashre swetashre removed the needs-triage This issue or PR still needs to be triaged. label Jan 21, 2021
rams3sh (Author) commented Feb 14, 2021

Until this issue is triaged or accepted for development, I found an interim solution using patch from Python's unittest.mock.

Below is a sample snippet. I have used the memoization package, but one may use other custom caching mechanisms instead.

import time
from unittest.mock import patch

import memoization
from boto3.session import Session
from botocore.client import BaseClient

# Only read-style operations are cached; operation names look like "ListUsers".
CALL_VERBS_TO_CACHE = ("List", "Get", "Describe")


class ResponseCachingProxyClient(BaseClient):

    def __init__(self, *args, cache_enabled=True, **kwargs):
        super().__init__(*args, **kwargs)
        self.cache_enabled = cache_enabled

    def _make_api_call(self, *args, **kwargs):
        action = args[0]  # Operation name, e.g. "ListUsers"
        if self.cache_enabled and action.startswith(CALL_VERBS_TO_CACHE):
            return self._make_cached_api_call(*args, **kwargs)
        return super()._make_api_call(*args, **kwargs)

    # Cached responses expire after 900 seconds (15 minutes).
    @memoization.cached(ttl=900, thread_safe=True, order_independent=True)
    def _make_cached_api_call(self, *args, **kwargs):
        return super()._make_api_call(*args, **kwargs)


# Clients created while the patch is active inherit from the caching class.
with patch('botocore.client.BaseClient', new=ResponseCachingProxyClient):
    session = Session()
    client = session.client('iam')
    paginator = client.get_paginator('list_users')
    start = time.time()
    for page in paginator.paginate():
        print(page['ResponseMetadata']['HTTPHeaders']['x-amzn-requestid'])  # Request ID to track whether the cache is being used
    print("Time taken before caching:", time.time() - start)
    start = time.time()
    for page in paginator.paginate():
        print(page['ResponseMetadata']['HTTPHeaders']['x-amzn-requestid'])  # Same request ID as above means the response came from the cache
    print("Time taken after caching:", time.time() - start)

    # Further business logic .....

I thought I would post it here in case anyone else is looking for an interim solution to this problem.
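For anyone who would rather not patch botocore internals, a similar effect can be sketched by wrapping an existing client object. CachingClientProxy and FakeIamClient below are hypothetical names, and the fake client only stands in for a real boto3 client so the sketch runs without AWS credentials. It assumes operations are called with keyword arguments only, skips get_paginator and get_waiter because they are not API calls, and keeps the cache in memory for one process.

```python
import time


class CachingClientProxy:
    """Hypothetical wrapper that caches read-style calls of a client-like object."""

    READ_PREFIXES = ("list_", "get_", "describe_")
    NOT_API_CALLS = {"get_paginator", "get_waiter"}

    def __init__(self, client, ttl_seconds=900, clock=time.monotonic):
        self._client = client
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # (method name, repr of kwargs) -> (expiry time, response)

    def __getattr__(self, name):
        # Only reached for attributes that are not defined on the proxy itself.
        method = getattr(self._client, name)
        cacheable = name.startswith(self.READ_PREFIXES) and name not in self.NOT_API_CALLS
        if not callable(method) or not cacheable:
            return method

        def cached_call(**kwargs):
            key = (name, repr(sorted(kwargs.items())))
            entry = self._cache.get(key)
            if entry is not None and entry[0] > self._clock():
                return entry[1]
            response = method(**kwargs)
            self._cache[key] = (self._clock() + self._ttl, response)
            return response

        return cached_call


class FakeIamClient:
    """Stand-in for a boto3 IAM client; counts how often it is really called."""

    def __init__(self):
        self.calls = 0

    def list_users(self, **kwargs):
        self.calls += 1
        return {"Users": [], "CallNumber": self.calls}

    def create_user(self, **kwargs):
        self.calls += 1
        return {"User": kwargs}


fake = FakeIamClient()
proxy = CachingClientProxy(fake, ttl_seconds=900)
a = proxy.list_users()
b = proxy.list_users()             # served from the in-memory cache
proxy.create_user(UserName="bob")  # write-style call, never cached
proxy.create_user(UserName="bob")
```

Paginators obtained from the wrapped client bypass this proxy, and responses that carry a one-time stream (such as the body returned by S3 get_object) should not be cached this way.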

rams3sh (Author) commented Feb 19, 2021

I created a new project called botocache to solve this issue.
Others with a similar use case can check out the project here.

kdaily (Member) commented Apr 6, 2021

Hi all, see the comment regarding closure of this issue here:

Thank you for your post @rams3sh. I would also like to point you to an external tool for application side caching: cachetools. There are interfaces within this lib that may support your use case. For the time being I will close this issue (as well as #2723). If we decide to take further action, will reopen and update.

Originally posted by @zdutta in boto/botocore#2296 (comment)

@kdaily kdaily closed this as completed Apr 6, 2021
