Remove AWS SDK from Lambda Layer to reduce package size by 90.5% #1164

Closed
mccauleyp opened this issue Nov 23, 2021 · 21 comments
Labels: feature-request

@mccauleyp

Is your feature request related to a problem? Please describe.
In trying to upgrade to v1.22.0, I ran into the deployment error "Layers consume more than the available size of 262144000 bytes". This is for a Lambda function that uses both the Lambda Powertools Layer and the latest AWS Data Wrangler Layer. I think the new Powertools layer is only a little larger than the previous version, but that is enough to tip over the limit when combined with the AWS Data Wrangler layer.

I pulled down the Powertools zip contents following the "Get the Layer .zip contents" instructions, and it looks like most of the size (~70 MB unzipped) comes from botocore (~63 MB).
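For anyone who wants to repeat this measurement, here is a minimal sketch that totals the size of each top-level entry in an unpacked layer directory. The function name and the directory you pass in are illustrative assumptions, not part of Powertools or any AWS tooling.

```python
import os
from pathlib import Path
from typing import Dict


def top_level_sizes(root: str) -> Dict[str, int]:
    """Return total bytes per top-level entry under `root`, largest first."""
    sizes: Dict[str, int] = {}
    for entry in Path(root).iterdir():
        if entry.is_file():
            sizes[entry.name] = entry.stat().st_size
            continue
        total = 0
        # Walk the whole subtree so nested data files (e.g. service models) count.
        for dirpath, _dirnames, filenames in os.walk(entry):
            for name in filenames:
                total += os.path.getsize(os.path.join(dirpath, name))
        sizes[entry.name] = total
    # Sort descending so the heaviest package is listed first.
    return dict(sorted(sizes.items(), key=lambda item: item[1], reverse=True))
```

Calling `top_level_sizes(...)` on the folder that holds the unpacked packages should put botocore at the top if it dominates the layer as described.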

Is it necessary to include botocore and boto3 in the layer's zip file given that these are available by default in the Lambda runtime? If not, it would be helpful to remove them from the pre-built layer to help avoid hitting the overall size limit.

Describe the solution you'd like
Remove botocore and boto3 from the Lambda layer zip archive.

Describe alternatives you've considered
I could build my own layer that doesn't include botocore and boto3. I tried this already by downloading the existing layer .zip, deleting those packages, zipping it back up and deploying it. That seems to work in my application, but it would be great if creating my own layer wasn't necessary.
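The manual workaround above (download the .zip, delete the packages, zip it back up) could be scripted roughly as follows. This is a sketch under assumptions: the function names are made up for illustration, and it drops any archive entry with a path component named `boto3` or `botocore`, or starting with `boto3-` / `botocore-` (to catch metadata folders), regardless of where the packages sit inside the archive.

```python
import zipfile
from typing import Sequence


def _matches(part: str, packages: Sequence[str]) -> bool:
    # Match the package folder itself or its versioned metadata folder.
    return any(part == pkg or part.startswith(pkg + "-") for pkg in packages)


def strip_packages(
    src_zip: str,
    dst_zip: str,
    packages: Sequence[str] = ("boto3", "botocore"),
) -> int:
    """Copy src_zip to dst_zip without the given packages; return entries dropped."""
    removed = 0
    with zipfile.ZipFile(src_zip) as src, zipfile.ZipFile(dst_zip, "w") as dst:
        for info in src.infolist():
            if any(_matches(part, packages) for part in info.filename.split("/")):
                removed += 1
                continue
            # Reusing the ZipInfo keeps each entry's name and compression type.
            dst.writestr(info, src.read(info))
    return removed
```

As noted in the thread, a layer trimmed this way relies on the boto3/botocore versions that the Lambda runtime provides, so it is worth testing in the deployed environment.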

Additional context
You could replicate the error by trying to deploy a function with both the pre-built AWS Lambda Powertools and AWS Data Wrangler layers included. The code size coming from my application is negligible compared to the layer sizes.

P.S. Thanks for the great resource! This has been a very helpful package for me and my team :)

@heitorlessa heitorlessa changed the title Reduce size of Lambda layer .zip file by removing dependencies already in Lambda runtime Reduce size of Lambda layer .zip file by removing botocore already in Lambda runtime Nov 23, 2021
@heitorlessa
Contributor

Hey @mccauleyp - thanks a lot for creating this. Could you do a +1 for this feature request on Boto Modularization to help us solve this properly?

We've been thinking of two strategies here but haven't had a chance to write them up extensively, so I'll write off the top of my head and expand on them later :):

  1. Cut a 2.0 with optional dependencies only. For those using pip, Powertools would be 1.6M. However, there are subtle challenges beyond our control, for example a customer would need to bring AWS X-Ray SDK to use Tracer, which depends on Boto (65M) leading to ~68M extra (boto3+botocore+wrapt+small stuff) either way. If you use Tracer in Powertools 2.0 + Data Wrangler, you're gonna hit this issue again.

  2. Collect customer names to make a case for Lambda Powertools and/or others to be embedded at runtime. Sooner or later you will hit a package size limit, it's a matter of when - adding numpy (~90M), arrow (~62M), pandas (~50M), plus depending on any non-modularized SDK (~65M) will eventually hit a limit as these packages improve and support older Python versions (future (3M)).
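To make the arithmetic in point 2 concrete, here is a small sketch that adds up the approximate sizes quoted above and compares them with the 262144000-byte limit from the deployment error. It assumes the "M" figures are mebibytes; the dictionary keys and function names are illustrative only.

```python
from typing import Dict

LAYER_LIMIT_BYTES = 262_144_000  # limit quoted in the deployment error
MIB = 1024 * 1024

# Approximate sizes from the comment above, treated as MiB (an assumption).
APPROX_SIZES_MIB: Dict[str, int] = {
    "numpy": 90,
    "arrow": 62,
    "pandas": 50,
    "non-modularized SDK": 65,
}


def total_mib(sizes: Dict[str, int]) -> int:
    """Sum the package sizes, in MiB."""
    return sum(sizes.values())


def fits_limit(sizes: Dict[str, int]) -> bool:
    """True if the combined size stays within the layer limit."""
    return total_mib(sizes) * MIB <= LAYER_LIMIT_BYTES


if __name__ == "__main__":
    print(f"limit: {LAYER_LIMIT_BYTES // MIB} MiB")
    print(f"total: {total_mib(APPROX_SIZES_MIB)} MiB, fits: {fits_limit(APPROX_SIZES_MIB)}")
```

With these rough numbers the four packages alone exceed the limit, and dropping the SDK entry brings the total back under it.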

I initially thought 1 was a great option over something like implicit namespaces, but then it struck me that a Lambda Layer/SAR App would still get to the same end result. Optimizing Lambda Layer/SAR App could eventually lead to known unknowns - e.g., boto X supports a feature we're using and it's not available at Lambda Runtime, or Data Wrangler also depending on boto.

This led me to think, maybe we could add multiple SAR Apps as it's easier for us to create & maintain than public Layers, but the closer I look the more I see them as workarounds really.

Maybe I'm thinking this the wrong way and there's a better solution to this - I'd appreciate any idea anyone has on this topic.

In the meantime, I pinged the Data Wrangler team to have a chat around this topic, and will ping the Lambda team on whether 2 could be a reality if we have enough customers asking for it.

cc @cakepietoast @am29d

@am29d
Contributor

am29d commented Nov 23, 2021

Hey @mccauleyp - thanks again for creating the issue. I pondered this question while I worked on the layer. My biggest concern was dependency fragmentation, which is also in the nature of how layers work. If we remove botocore (an easy fix), it would a) break existing code, b) require us to provide an exact version match between botocore and Powertools (e.g. 1.22.0 works with >=1.18.x and <1.23.x), and c) require us to document it well enough that developers can spot missing dependencies or version mismatches early. It might sound like we are reinventing a dependency management solution here.
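As a rough illustration of the version matching described in b), the check below compares a botocore version string against a supported range. The bounds are just the example range from the comment above, not a real compatibility matrix, and the helper names are hypothetical. The parser only handles plain dotted numeric versions.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.20.5' into (1, 20, 5); assumes purely numeric components."""
    return tuple(int(part) for part in version.split("."))


def is_compatible(
    botocore_version: str,
    minimum: Tuple[int, int] = (1, 18),            # example lower bound (inclusive)
    maximum_exclusive: Tuple[int, int] = (1, 23),  # example upper bound (exclusive)
) -> bool:
    """True if the major.minor version falls within [minimum, maximum_exclusive)."""
    major_minor = parse_version(botocore_version)[:2]
    return minimum <= major_minor < maximum_exclusive
```

A check like this could run at import time to fail early with a clear message, which is the "spot version mismatches early" part of c).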

One short-term solution I see is to verify that powertools works with the botocore version shipped in the Lambda runtime and remove our shipped botocore package. This would couple the powertools layer to lambda runtime, and we need to verify what impact it might have for different powertools utilities.

I see option 2 mentioned by Heitor as an ideal solution to have powertools in the embedded runtime.

@heitorlessa
Contributor

Spoke with the Data Wrangler team and they are removing boto3/botocore before building the Lambda Layer along with other data to fit within the Lambda code artifact limit - however this is not being tested so boto/botocore changes or new features might not work when using Lambda Layers.

I went ahead and wrote a quick script[1] to check the size impact of removing boto, and of cutting a 2.0 with optional dependencies only, as already planned:

  • today with boto3: 72M
  • removing boto: 6.7M
  • removing boto and X-Ray SDK (V2 only): 1.8M
  • Powertools only (V2 only): 1.6M
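For a quick sense of scale, the relative savings for each row can be computed from the sizes listed above. This is only arithmetic on the rounded figures in the list; the function name is illustrative.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage saved when going from `before` to `after` (same unit)."""
    return (before - after) / before * 100


if __name__ == "__main__":
    baseline = 72.0  # "today with boto3", in M
    for label, size in [
        ("removing boto", 6.7),
        ("removing boto and X-Ray SDK", 1.8),
        ("Powertools only", 1.6),
    ]:
        print(f"{label}: {percent_reduction(baseline, size):.1f}% smaller")
```

Because the inputs are rounded, the percentages are approximate.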

If we were to do a post-build optimization before publishing a Lambda Layer/SAR App, we would need a mechanism to be certain this wouldn't impact existing customers, otherwise we would need a major version -- Integration testing running within Lambda runtime could give us that safety net.


[1] Shell script to test size impact

#!/bin/bash
# Measure the installed size of Powertools after stripping its dependencies.
set -euo pipefail

DEST_DIR="/tmp/powertools-no-deps"

# Glob patterns for each dependency group; they are expanded inside DEST_DIR.
BOTO_DEPS=("boto*" "urllib3*" "*dateutil*" "s3transfer*" "*jmespath*" "*six*")
XRAY_DEPS=("*wrapt*" "*future*" "*aws_xray_sdk*" "libfuturize" "libpasteurize" "past" "bin")
POWERTOOLS_DEPS=("fastjsonschema")

function main() {
    pip install aws-lambda-powertools -t "${DEST_DIR}"
    # With `set -e`, a failed pushd aborts before anything is deleted.
    pushd "${DEST_DIR}" > /dev/null
    # ${pattern} is intentionally unquoted so the shell expands the glob.
    for pattern in "${BOTO_DEPS[@]}" "${XRAY_DEPS[@]}" "${POWERTOOLS_DEPS[@]}"; do
        rm -rf -- ${pattern}
    done
    popd > /dev/null
    du -hs "${DEST_DIR}"
}

main

@mccauleyp
Author

mccauleyp commented Nov 23, 2021

Hey @heitorlessa & @am29d, thanks for your replies! I've +1'd the boto modularization ticket as requested above.

To me it seems like removing boto* from the layer bundle is the most straightforward and easy thing to do for now, as Alex suggested. That's consistent with what the Data Wrangler team is doing, and naively it seems reasonable that Lambda Powertools should guarantee compatibility with the Lambda runtime versions of boto*. What's the motivation for not sticking with the built-in versions to begin with? I guess they're not updated as regularly?

Another near-term solution might be releasing an optional layer archive with boto* removed, similar to the "extra dependencies" layer version that you already have. Then perhaps make that the default if/when it can be verified that Powertools is fully compatible with the built-in boto* versions. 

@lorengordon

It has certainly been frustrating for me in the past that the lambda runtime is not updated frequently with the current botocore/boto3 versions. I'm not sure of the current state, but there have been times when the versions are years out of date. Which certainly does make testing un-fun, when the code works locally but then fails when deployed, without also packaging boto into the lambda. :(

Just an observation. There probably are good ways for this project to handle the change, ensuring coverage for the services/methods it uses. I figure it will likely involve the user being a bit more aware of and responsible for their packaging and layer configuration.

@heitorlessa
Contributor

100% @lorengordon - I think we can meet in the middle. This could work if we run integration tests within the actual Lambda runtime upon merging code into develop, and update the documentation to say that PyPI is the only source for the latest boto3/botocore features and bug fixes.

I thought of using LambCI for that, but it no longer seems to be maintained. It's one of those problems that requires going back to the whiteboard to find a good balance. We should download the latest code from GitHub, run integration tests within the Lambda runtime, and report back if any functionality isn't working.

@mccauleyp as @lorengordon mentioned, these are updated irregularly - could be weeks, months, or even 1 year+. This can have unexpected side effects, where your Lambda function works fine locally (unit tests, etc.) but not when deployed. The Data Wrangler team isn't testing for that; it's something they want to fix too, and whatever we come up with would be helpful to both teams (and whoever comes next).
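One low-cost way to notice the local-versus-deployed mismatch described above is to report which SDK versions are actually importable in each environment. The sketch below uses only the standard library's `importlib.metadata`; the function names are illustrative, and a package that is present without installation metadata would show up as `None`.

```python
from importlib import metadata
from typing import Dict, Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed distribution version, or None if it is not found."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def sdk_versions() -> Dict[str, Optional[str]]:
    """Collect the versions of the SDK packages discussed in this thread."""
    return {name: installed_version(name) for name in ("boto3", "botocore")}


if __name__ == "__main__":
    # Run this locally and inside the deployed function, then compare the output.
    print(sdk_versions())
```

Logging this once at cold start would make version drift visible without packaging anything extra.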

@mwarkentin

@heitorlessa I've voted for the boto modularization - but I'm curious if there are any internal feature requests for increasing the max lambda size? Seems like another pretty good solution would be a limit increase on the max size of Lambdas - any PFRs we could throw our weight behind on that side of things? It's been 256MB as long as I can remember..

@heitorlessa
Contributor

Hey @mwarkentin, the Container Image package type supports up to 10G, while Zip is limited to 256M. Container Images have a perf impact though, and depending on how big the zip is, it could incur a perf impact too.

I'll add a +1 to the PFR to increase the zip as I already know your company name ;-)

You can also use #awswishlist on Twitter for it to get automatically added/bumped.

@lorengordon

@heitorlessa Care to share the specific PFR? I can add weight to that one also.

@DanyC97
Contributor

DanyC97 commented Dec 6, 2021

> I thought of us using LambCI for that but it seems no longer maintained.

👋 @heitorlessa I've just been looking into this a bit and bumped into https://github.com/aws/aws-lambda-base-images/tree/python3.9 . Not sure if it is like-for-like, but maybe you can have a chat with the maintainers? (Sadly they are not very responsive on GH issues, so it's hard to know from the outside.)

@heitorlessa
Contributor

> @heitorlessa Care to share the specific PFR? I can add weight to that one also.

@lorengordon the PFR is not public - Lambda doesn't have a public roadmap intake. As AWS staff, we can add customer names to it and use any details you can share - e.g. is this a blocker? a critical improvement? a nice to have?

@DanyC97 thanks for the link! These would be for running OCI images on Lambda. They're not like-for-like per se with what the provided runtime has.

@mccauleyp
Author

The latest versions of the AWS Data Wrangler and Lambda Powertools Layers now seem to be compatible. Wrangler version 2.13.0 was released a few days ago, and I'm able to deploy that with the latest Powertools layer without hitting the size limit. I'm using the Wrangler SAR option and the Powertools region-specific layer ARN:4. I didn't actually dig into the size differences to see what exactly changed.

I think it's still worth trying to trim out the Boto dependencies if possible because there's not much overhead left for additional dependencies, but my original motivation for opening this ticket is resolved for now.

@heitorlessa
Contributor

heitorlessa commented Dec 29, 2021

EDIT: Clarify the idea of removing boto from the Lambda Layer, not from PyPI.
EDIT 2: Point out potential problems with PyPI typosquatting attacks and namespace ownership.

I'm moving this to our official roadmap and providing some updates for posterity.

Lambda team confirmed runtime SDKs are being updated a few times a year, and the Runtime docs are being updated to reflect the latest pinned versions: https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtimes.html

This gives us confidence it's no longer 6-12 months outdated as we experienced in the past. It also means we can safely remove boto3/botocore from Lambda Layers/SAR App given our feature set; cc @am29d.

Note: It will only be 100% guaranteed though when we have E2E tests running without boto. Stability is hugely important to us as we have hundreds of customers using Powertools in prod.

For this reason, we will work on two major pieces of work: 1/ an E2E test mechanism that runs upon merge to develop, 2/ making boto, along with any other dependencies, an optional extra in v2. We need 1/ before we proceed with removing boto from the Lambda Layer (not from PyPI).
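To illustrate what optional dependencies can look like from the user's side, here is a generic lazy-import sketch: the heavy package is only imported when a feature needs it, and a missing package produces an actionable error. This is not the Powertools implementation; the helper name and the extra name in the usage note are hypothetical.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency, or raise an error naming the extra to install.

    `extra` is a hypothetical extras name used only for the error message.
    """
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Install it with: pip install 'aws-lambda-powertools[{extra}]'"
        ) from exc
```

A feature that needs the SDK would call something like `require_optional("boto3", "aws-sdk")` at the point of use rather than at module import, so users who never touch that feature pay no size cost.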

In an ideal world, I'd like to invest in modularization through Implicit Namespace Packages (PEP 420). However, I'm concerned about the operational cost of getting this to work smoothly on releases, and about the potential attack vectors, since anyone could publish a micropackage under our namespace -- I'd estimate 6 months of work to experiment, automate, and keep the maintenance costs low.
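For readers unfamiliar with PEP 420, the sketch below shows the mechanism in isolation: two separately installed directories each contribute a module to the same namespace because neither contains an `__init__.py`. The namespace and module names are invented for the demo. It also hints at the concern raised above, since any directory on the import path can add a portion to the namespace.

```python
import importlib
import sys
from pathlib import Path
from typing import List

NAMESPACE = "demo_powertools_ns"  # hypothetical namespace, for illustration only


def write_portion(root: Path, module: str) -> None:
    """Create <root>/<NAMESPACE>/<module>.py, deliberately without __init__.py."""
    package_dir = root / NAMESPACE
    package_dir.mkdir(parents=True)
    (package_dir / f"{module}.py").write_text(f"NAME = {module!r}\n")


def load_portions(base: Path) -> List[object]:
    """Build two independent 'distributions' and import from both."""
    roots = [base / "dist_logger", base / "dist_tracer"]
    write_portion(roots[0], "logger")
    write_portion(roots[1], "tracer")
    # Each root simulates a separately installed micropackage.
    sys.path[:0] = [str(root) for root in roots]
    importlib.invalidate_caches()
    logger = importlib.import_module(f"{NAMESPACE}.logger")
    tracer = importlib.import_module(f"{NAMESPACE}.tracer")
    namespace = sys.modules[NAMESPACE]
    # The namespace's __path__ spans both directories.
    return [logger.NAME, tracer.NAME, len(list(namespace.__path__))]
```

Running `load_portions` on an empty temporary directory imports both modules through one shared namespace even though they live in different directories.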

If anyone reading this later has experience with Python microlibs and Poetry, please do reach out - @heitor_lessa.

Thank you everyone!

@heitorlessa heitorlessa changed the title Reduce size of Lambda layer .zip file by removing botocore already in Lambda runtime Remove botocore from Lambda Layer to reduce package size by 90.5% Dec 29, 2021
@heitorlessa heitorlessa transferred this issue from aws-powertools/powertools-lambda-python Dec 29, 2021
@heitorlessa heitorlessa changed the title Remove botocore from Lambda Layer to reduce package size by 90.5% Remove boto from Lambda Layer to reduce package size by 90.5% Dec 29, 2021
@heitorlessa heitorlessa changed the title Remove boto from Lambda Layer to reduce package size by 90.5% Remove AWS SDK from Lambda Layer to reduce package size by 90.5% Dec 29, 2021
@heitorlessa heitorlessa transferred this issue from aws-powertools/powertools-lambda Apr 28, 2022
@heitorlessa
Contributor

heitorlessa commented Apr 28, 2022

Update: @am29d is working on migrating our internal Lambda Layers pipeline to CodePipeline in a Powertools AWS account. In parallel, @mploski is working on our E2E test framework. Both will give us the confidence we need to remove Boto at runtime.

You can follow progress here: github.com/orgs/awslabs/projects/51/views/11

@heitorlessa
Contributor

We found a solution! @mploski will be working on some rigorous tests to ensure we don't break anyone when including a ~500K botocore in the final PyPI asset and Lambda Layer, but I'm super super excited we've made progress here with help from the AWS Python SDK team <3

@heitorlessa
Contributor

We're enabling E2E at the merge level and will be able to test this more confidently as we increase coverage. In the meantime, @mploski is finishing another project in his day-to-day role and will then resume exploratory tests on the new squashed boto to see whether we accidentally cause any conflict with Lambda's boto runtime dependency.

@rubenfonseca
Contributor

rubenfonseca commented Oct 7, 2022

We ran some load tests to measure the impact of layer size on cold start. Findings:

1/ Layer size doesn't seem to have a meaningful impact on cold start time (at least when the layer doubles in size and it's < 5M)
2/ Using natively compiled code (Cython) reduces cold start time (even though the layer size is bigger)

Based on this we're launching the v2 layer with compiled native code for each architecture.

Current Lambda Layer size: 2.6MB

@heitorlessa
Contributor

@mccauleyp it took us a while, but you'll be pleased to see the V2 results this month ;) We now have an E2E framework in place to detect regressions in case the Lambda runtime changes a dependency ahead of a release.

Thank you for sticking with us all this time!

@heitorlessa heitorlessa assigned rubenfonseca and unassigned mploski Oct 7, 2022
@rubenfonseca
Contributor

For reference, here are the results of the benchmarks we ran to measure the cold start impact of loading the different libraries:

arm64, 128 MB

Scenario                                 coldStart  count  p50       p90       p99       max
Baseline (empty handler, no Powertools)  1          16     114.55    133.96    141.66    141.66
Just logger                              1          10     144.12    158.07    161.87    161.87
Just parser                              1          10     443.89    465.09    471.02    471.02
Logger + parser                          1          10     449.56    468.42    469.58    469.58
Just tracer                              1          15     293.8763  309.8637  326.7208  330.03
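To read the benchmark figures as added cost per utility, the p50 values above can be compared against the baseline. The numbers below are copied from the results; the dictionary keys and function name are illustrative, and the values are in whatever unit the benchmark reported.

```python
from typing import Dict

BASELINE = "baseline"

# p50 cold start values copied from the benchmark results above.
P50: Dict[str, float] = {
    BASELINE: 114.55,
    "logger": 144.12,
    "parser": 443.89,
    "logger+parser": 449.56,
    "tracer": 293.8763,
}


def overhead_vs_baseline(p50: Dict[str, float]) -> Dict[str, float]:
    """Return each scenario's p50 minus the baseline p50, rounded to 2 decimals."""
    base = p50[BASELINE]
    return {
        name: round(value - base, 2)
        for name, value in p50.items()
        if name != BASELINE
    }


if __name__ == "__main__":
    for name, extra in overhead_vs_baseline(P50).items():
        print(f"{name}: +{extra}")
```

Seen this way, the parser accounts for most of the added cold start time, while combining logger and parser adds little beyond the parser alone.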

@heitorlessa
Contributor

heitorlessa commented Oct 18, 2022

Changing it to Coming soon as we expect to launch V2 with these optimizations by EOW.

Updates: #1459 (comment)



8 participants