Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Instance list cache #33

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

spanagiot
Copy link

The ssm resolver as it is implemented now needs to fetch all the instances from AWS in every connection attempt. I thought it would faster to store the response in a temporary location for caching to speed up consecutive requests.

The cache lasts for one day (RESOLVER_CACHE_DURATION) and there is a flag --update-cache that can be used to trigger a cache update.

@mludvig
Copy link
Owner

mludvig commented Sep 16, 2022

Hi, thanks for the PR. I don't want this to be the default behaviour, as it could be very confusing for users. If anything the cache should be an explicit argument (e.g. --use-cache, --update-cache and --cache-file) so that only people who need it can use it. Also the cache should be stored in the user local cache dir, e.g. ~/.cache/ssm-session. I believe that there's a library to figure out the actual location on different systems.

Also please check the latest release from the master branch to see if the speed has improved for you. We now no longer fetch all the records but instead only the active ones, it may help.

@spanagiot
Copy link
Author

Thank you for your feedback.

The reason I didn't use the ~/.cache location was the extra library that needed and I didn't know if that was ok. I'll add it and fix that.

I'll merge your changes to my branch and update the cache behaviour.

@mludvig
Copy link
Owner

mludvig commented Sep 16, 2022

Hi again, one more thing - the cache should probably be aware of the account id and region. It makes sense to cache the records for each acct-id+region pair.

@spanagiot
Copy link
Author

I made all the recommended changes, seems to work as requested now.

Copy link
Owner

@mludvig mludvig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch. I have added some comments, nothing major. The support should also be added to ec2-ssh and to ssm-tunnel - it's using the same reolver, just the argument parsing and the logic around allowed argument combination should be added. Perhaps externalise the arguments to ssm_tools/common.py and create add_cache_parameters() function?

Also we now have type hints and black formatting, you can use ./check.sh to make sure that all the linting checks pass.

@@ -18,21 +21,84 @@ def __init__(self, args):
# Construct boto3 session with MFA cache
self.session = boto3.session.Session(profile_name=args.profile, region_name=args.region)
self.session._session.get_component('credential_provider').get_provider('assume-role').cache = botocore.credentials.JSONFileCache(cli_cache)

self.account_region = self.session.region_name
self.account_id = self.session.resource('iam').CurrentUser().user_id
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be session.client('sts').get_caller_identity()['Account'] to make it work with Users as well as EC2 IAM Roles.

@@ -18,21 +21,84 @@ def __init__(self, args):
# Construct boto3 session with MFA cache
self.session = boto3.session.Session(profile_name=args.profile, region_name=args.region)
self.session._session.get_component('credential_provider').get_provider('assume-role').cache = botocore.credentials.JSONFileCache(cli_cache)

self.account_region = self.session.region_name
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if region_name is not set, e.g. when using EC2 IAM Role?


if self.use_cache:
if self.cache_file is None:
self.instance_cache = os.path.join(self.cache_dir, f'ssm_instances_{self.account_id}_{self.account_region}')
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to use pathlib.Path

self.instance_cache = self.cache_file

def get_list_cache(self):
if not os.path.exists(self.instance_cache):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.


if time.time() - os.path.getmtime(self.instance_cache) > self.RESOLVER_CACHE_DURATION:
logger.debug('Did not load cache because file is older that one day')
return None
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, use pathlib. Also should the cache file be deleted in this case?

if bool(args.INSTANCE) + bool(args.list) != 1:
# Require exactly one of INSTANCE or --list and allow to do only cache update
if bool(args.INSTANCE) and bool(args.list) or \
not args.update_cache and not bool(args.INSTANCE) and not bool(args.list):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

if bool(args.INSTANCE) + bool(args.list) + bool(args.update_cache) != 1:
    parser.error("Specify either INSTANCE or --list or --update-cache")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants