Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AWS python lambda execution time get doubled in cryptography >= 3.1 #6301

Closed
prateek-singh-wcar opened this issue Sep 24, 2021 · 20 comments
Closed

Comments

@prateek-singh-wcar
Copy link

Hello
We have a AWS lambda and when we are upgrading cryptography version > 3.0 like 3.1 , the lambda execution time just get doubled and working fine till cryptography<= 3.0

I can see, there are few new things added in v3.1 like PKCS7 support for certs and cipher context, So can you please suggest why its taking so much time in lambda execution for cryptography version >= 3.1 .

We are using below versions of libs -

Python == 3.7
cffi == 1.14.5
cryptography == 3.1
pip == 20.2.4
setuptools == 50.3.2

Please have a look and suggest if there is something which is causing the issue.

@alex
Copy link
Member

alex commented Sep 24, 2021

Your issue doesn't have enough details for us to actually debug.

pyca/cryptography has many different features, without you providing sample code that's slowed down we have no idea what might have gotten slower.

@prateek-singh-wcar
Copy link
Author

prateek-singh-wcar commented Sep 24, 2021

Here is the sample code -

import python_jwt  #python-jwt==3.3.0

return python_jwt.generate_jwt(payload,
                                   priv_key=SECRET_KEY,
                                   algorithm=ALGORITHM,
                                   other_headers=additional_headers)

Please have a look. 👍

@alex
Copy link
Member

alex commented Sep 24, 2021

Can you please provide a fully executable script that we can actually run and benchmark. This code by itself isn't enough to actually execute (I have no idea what hte value of ALGORITHM is for example)

@0xaead
Copy link
Contributor

0xaead commented Sep 25, 2021

@prateek-singh-wcar the sample code you provided reveals that there's another layer of abstraction by jwcrypto on top of pyca/cryptograph. But you didn't provide the version of it yet.

return python_jwt.generate_jwt(payload,
                                   priv_key=SECRET_KEY,
                                   algorithm=ALGORITHM,
                                   other_headers=additional_headers) 

More importantly, above code is not a sololy executable snippet. Even if return can be eliminated supposing your point is that python_jwt() is having some performance issue, you should at least provide examples of all the parameters it takes so others can reproduce the issue with their environment.

To solve the issue you might also need to answer some general questions like: Had you run the code on a native environment instead of cloud? Did you use some module like timeit to identify the hotspot and what is the result?

@prateek-singh-wcar
Copy link
Author

Hi
Here is the details along with answer of question you asked -

Had you run the code on a native environment instead of cloud? Yes we have run the script at different native system and its have same behavior.

Did you use some module like timeit to identify the hotspot and what is the result? No , we did not use timeit

We are using jwcrypto==0.9.1

@prateek-singh-wcar
Copy link
Author

prateek-singh-wcar commented Sep 27, 2021

Here is the more specific script -

#from calendar import timegm

def create_token():
	custom_policies = {
                             "Env":"Dev1",
                             "Work":"AsDev"}
    expiry_time = timegm(datetime.utcnow().utctimetuple()) + 60
    private_key = get_your_private_key()
   key_id = get_your_kid #if there is any

    payload = {'iss': 'Some issuere',
               'sub': 'TestTokenSub',
               'sub-alias': 'TestToken',
               'aud': 'Only_My_Team',
               'aud-alias': 'Team_Rocks',
               'exp': expiry_time,
               'kid': key_id,
               'policy': custom_policies}

    additional_headers = {'kid': key_id}

    return python_jwt.generate_jwt(payload,
                                   priv_key=private_key,
                                   algorithm= RS256,
                                   other_headers=additional_headers)
                                   

Hope that that should work for your debugging stuff.

@tiran
Copy link
Contributor

tiran commented Sep 27, 2021

Could you please provide a run-able script? The code is missing several imports and function definitions for get_your_private_key() and get_your_kid().

@reaperhulk
Copy link
Member

The request is for a script that can be run without modification that demonstrates the performance problem such that we can reproduce it with multiple versions of cryptography. The code provided quite clearly does not run as it is missing variables, has a commented out import that is required inside the function body, and the functionality in question is inside a function that is never called. I'm disappointed that we've had to make such a basic request multiple times when we're attempting to volunteer our time to assist you. To be more explicit you'll need to provide the following:

  • A requirements.txt (or its string equivalent!) that contains the exact set of dependencies required to execute your script and is pinned to the versions you're using. You sort of provided this in the initial post, but there's also been conflicting information about usage of python-jwt and jwcrypto. We need accurate information.
  • A script that, when executed in a virtual environment with the requirements from requirements.txt installed, reproduces the problem with no input required. e.g. I should execute python example_script.py in the venv and it works.

@reaperhulk
Copy link
Member

Closing due to lack of activity. If the problem still exists feel free to reopen and provide the information from the previous comment so we can replicate and investigate.

@ilya-shabalin-wcar
Copy link

ilya-shabalin-wcar commented Oct 6, 2021

Hi, sorry for not following up earlier - we had paused the bump for now. Let's see if following code can help to investigate:


from calendar import timegm
from datetime import datetime


def create_token():
    custom_policies = {"Env": "Dev1", "Work": "AsDev"}
    expiry_time = timegm(datetime.utcnow().utctimetuple()) + 60
    private_key = 'MIIBVwIBADANBgkqhkiG9w0BAQEFAASCAUEwggE9AgEAAkEAvGNaerZ1lX7C94IC' \
                  'kPb6raDxY9Ecs0eBlvCL1r+2PfeORbMEOtEc94m6za9tElWF7yyCAPBoWAmoyF6c' \
                  'y+dZFwIDAQABAkEAnTyfzER2J+zBdiwBUFWULU9jqthiiBBqzTzAuuEQju6dEkav' \
                  '4F/3h2I5PZffIsl08TVXlUXjPjTJiMZeIUsbIQIhAO1Vafxq8MFi7E0XOFb1zipV' \
                  'GkYR4oH4MYeaYIz1DtWzAiEAyzRybgKvGuerL2ACeWj/lK+utp1G9PnMv7bpwWce' \
                  'BQ0CIQC/OfQUuwHTf+ZP0ybFAnckKr/pt11fRWB7F0s2J/nU+wIhAJpAnOC/rtbY' \
                  'p80Vr6skNmdTGWhMabHK8k9R9Y43zcT9AiEAiE1Fu8lQyYNoiqvEcG0w3JNh+C9A' \
                  '1vvVtwPDil+mn/c='
    payload = {'iss': 'Some issuer',
               'sub': 'TestTokenSub',
               'sub-alias': 'TestToken',
               'aud': 'Only_My_Team',
               'aud-alias': 'Team_Rocks',
               'exp': expiry_time,
               'kid': '2021-09-19',
               'policy': custom_policies}
    additional_headers = {'kid': '2021-09-19'}
    return python_jwt.generate_jwt(payload,
                                   priv_key=private_key,
                                   algorithm='RS256',
                                   other_headers=additional_headers)

Current requirements.txt (the version specified here is 2.6.1, when we bump to >3.0 - duration increases, no other changes to code / deps are done in combination to bump)

asn1crypto==1.4.0
# via cryptography
boto3==1.18.46
# via access-management-m2m-dynamodb
botocore==1.21.46
# via
# access-management-m2m-dynamodb
# boto3
# s3transfer
cffi==1.14.6
# via cryptography
cryptography==2.6.1
# via
# -r requirements.in
# jwcrypto
jwcrypto==1.0
# via python-jwt
six==1.16.0
# via
# cryptography
# ecdsa
# python-dateutil
pyasn1==0.4.8
# via
# python-jose
# rsa
pycparser==2.20
# via cffi
python-dateutil==2.8.2
# via botocore
python-jose==3.3.0
# via -r requirements.in
python-jwt==3.3.0
# via -r requirements.in
requests==2.26.0
# via -r requirements.in
rsa==4.7.2
# via python-jose
s3transfer==0.5.0
# via boto3

@prateek-singh-wcar
Copy link
Author

@reaperhulk can you please re open it.

@tiran
Copy link
Contributor

tiran commented Oct 6, 2021

The code is not runable. It's missing imports and a if __name__ == "__main__": block.

@gustavo-siqueira-wcar
Copy link

Hi @tiran here's a runnable code:

import timeit

setup = """
from calendar import timegm
from datetime import datetime

import python_jwt
from jwcrypto import jwk

custom_policies = {"Env": "Dev1", "Work": "AsDev"}
expiry_time = timegm(datetime.utcnow().utctimetuple()) + 60
key = jwk.JWK.generate(kty='RSA', size=2048)
payload = {'iss': 'Some issuer',
            'sub': 'TestTokenSub',
            'sub-alias': 'TestToken',
            'aud': 'Only_My_Team',
            'aud-alias': 'Team_Rocks',
            'exp': expiry_time,
            'kid': '2021-09-19',
            'policy': custom_policies}
additional_headers = {'kid': '2021-09-19'}
"""

stmt = """
python_jwt.generate_jwt(
    payload,
    priv_key=key,
    algorithm='RS256',
    other_headers=additional_headers
)
"""


if __name__ == "__main__":
    print(timeit.timeit(stmt=stmt, setup=setup, number=5))

env with cryptography==2.6.1:

asn1crypto==1.4.0
cffi==1.14.6
cryptography==2.6.1
Deprecated==1.2.13
jwcrypto==1.0
pycparser==2.20
python-jwt==3.3.0
six==1.16.0
wrapt==1.13.1

env with Cryptography==3.2:

cffi==1.14.6
cryptography==3.2
Deprecated==1.2.13
jwcrypto==1.0
pycparser==2.20
python-jwt==3.3.0
six==1.16.0
wrapt==1.13.1

And here's the difference between the outputs from timeit for each version:
Using 2.6.1: 0.007067927999742096
Using 3.2: 0.03307457300252281

@tiran
Copy link
Contributor

tiran commented Oct 6, 2021

Thank you, that's a perfect!

I can reproduce the issue on my laptop:

2.6.1: 0.006050721975043416
3.2: 0.019777562003582716
3.4.8: 0.02039522398263216

@tiran
Copy link
Contributor

tiran commented Oct 6, 2021

It looks like the performance issue is caused by RSA_check_key, which was introduced in PR #5307.

1,000 iterations of python_jwt.generate_jwt() with 3.4.8:

         1328307 function calls (1242804 primitive calls) in 4.404 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1001    2.838    0.003    2.838    0.003 {built-in method _openssl.RSA_check_key}
     2000    0.564    0.000    0.564    0.000 {built-in method _openssl.EVP_PKEY_sign}
     1001    0.368    0.000    0.368    0.000 {built-in method _openssl.RSA_blinding_on}
85000/1000    0.084    0.000    0.197    0.000 copy.py:128(deepcopy)
        1    0.032    0.032    0.032    0.032 {built-in method _openssl.RSA_generate_key_ex}
     5000    0.027    0.000    0.027    0.000 encoder.py:204(iterencode)
    12000    0.020    0.000    0.112    0.000 copy.py:258(_reconstruct)
   200041    0.017    0.000    0.017    0.000 {method 'get' of 'dict' objects}
    72000    0.016    0.000    0.080    0.000 copy.py:263(<genexpr>)
     1000    0.014    0.000    4.300    0.004 __init__.py:17(generate_jwt)

1,000 iterations with 2.6.1:

         1345377 function calls (1259535 primitive calls) in 1.541 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     2000    0.556    0.000    0.556    0.000 {built-in method _openssl.EVP_PKEY_sign}
     1000    0.367    0.000    0.367    0.000 {built-in method _openssl.RSA_blinding_on}
85000/1000    0.084    0.000    0.195    0.000 copy.py:128(deepcopy)
     5000    0.026    0.000    0.026    0.000 encoder.py:204(iterencode)
    12000    0.020    0.000    0.112    0.000 copy.py:258(_reconstruct)
   200986    0.017    0.000    0.017    0.000 {method 'get' of 'dict' objects}
    72000    0.016    0.000    0.080    0.000 copy.py:263(<genexpr>)
        1    0.015    0.015    0.015    0.015 {built-in method _openssl.RSA_generate_key_ex}
     1000    0.013    0.000    1.442    0.001 __init__.py:17(generate_jwt)
   105726    0.012    0.000    0.017    0.000 {built-in method builtins.isinstance}

@reaperhulk
Copy link
Member

reaperhulk commented Oct 6, 2021

Ah, interesting. We actually have an unofficial method for bypassing that check but it will never be part of our public API because it has serious security implications. You can see where it’s set

self._rsa_skip_check_key = False
and where it’s consumed:
def __init__(self, backend, rsa_cdata, evp_pkey, _skip_check_key):
# RSA_check_key is slower in OpenSSL 3.0.0 due to improved
# primality checking. In normal use this is unlikely to be a problem
# since users don't load new keys constantly, but for TESTING we've
# added an init arg that allows skipping the checks. You should not
# use this in production code unless you understand the consequences.

@tiran
Copy link
Contributor

tiran commented Oct 6, 2021

Cryptography could keep a positive cache for checked keys, too. This may not help with lambdas, but it will speed up long-running processes that load the same key over and over again.

 class _RSAPrivateKey(RSAPrivateKey):
+    # positive cache for checked keys
+    _checked_keys = {}
+
     def __init__(self, backend, rsa_cdata, evp_pkey, _skip_check_key):
         # RSA_check_key is slower in OpenSSL 3.0.0 due to improved
         # primality checking. In normal use this is unlikely to be a problem
         # since users don't load new keys constantly, but for TESTING we've
         # added an init arg that allows skipping the checks. You should not
         # use this in production code unless you understand the consequences.
-        if not _skip_check_key:
+        if not _skip_check_key and hash(rsa_cdata) not in self._checked_keys:
             res = backend._lib.RSA_check_key(rsa_cdata)
             if res != 1:
                 errors = backend._consume_errors_with_text()
                 raise ValueError("Invalid private key", errors)
+            self._checked_keys.add(hash(rsa_cdata))
 
         # Blinding is on by default in many versions of OpenSSL, but let's
         # just be conservative here.

@reaperhulk
Copy link
Member

That check won’t be useful because definitionally if we’re hitting this case repeatedly the user is loading the key repeatedly instead of caching for reuse themselves. So we’re allocating new openssl rsa_cdata structures.

@alex
Copy link
Member

alex commented Oct 6, 2021

I've dug deep into the belly of these JWT libs. It's a design issue there. The problem is that every time you attempt to do any cryptographic operation, it reloads the key from scratch: https://github.com/latchset/jwcrypto/blob/14c942d616c850e33c9fa11a38b0fc8bbb0a055c/jwcrypto/jwk.py#L826-L828

While the more comprehensive RSA primarily checks exacerbte this, it was already quite wasteful even before.

The correct solution is, once you have an RSAPrivateKey object, to simply keep it around, rather than throwing it away and reparsing every time.

@simo5
Copy link
Contributor

simo5 commented Nov 30, 2021

FYI: latchset/jwcrypto#254

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 1, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

No branches or pull requests

8 participants