AWS python lambda execution time get doubled in cryptography >= 3.1 #6301

prateek-singh-wcar · 2021-09-24T11:16:34Z

Hello
We have a AWS lambda and when we are upgrading cryptography version > 3.0 like 3.1 , the lambda execution time just get doubled and working fine till cryptography<= 3.0

I can see, there are few new things added in v3.1 like PKCS7 support for certs and cipher context, So can you please suggest why its taking so much time in lambda execution for cryptography version >= 3.1 .

We are using below versions of libs -

Python == 3.7
cffi == 1.14.5
cryptography == 3.1
pip == 20.2.4
setuptools == 50.3.2

Please have a look and suggest if there is something which is causing the issue.

alex · 2021-09-24T11:22:06Z

Your issue doesn't have enough details for us to actually debug.

pyca/cryptography has many different features, without you providing sample code that's slowed down we have no idea what might have gotten slower.

prateek-singh-wcar · 2021-09-24T13:00:00Z

Here is the sample code -

import python_jwt  #python-jwt==3.3.0

return python_jwt.generate_jwt(payload,
                                   priv_key=SECRET_KEY,
                                   algorithm=ALGORITHM,
                                   other_headers=additional_headers)

Please have a look. 👍

alex · 2021-09-24T22:55:17Z

Can you please provide a fully executable script that we can actually run and benchmark. This code by itself isn't enough to actually execute (I have no idea what hte value of ALGORITHM is for example)

0xaead · 2021-09-25T20:06:52Z

@prateek-singh-wcar the sample code you provided reveals that there's another layer of abstraction by jwcrypto on top of pyca/cryptograph. But you didn't provide the version of it yet.

return python_jwt.generate_jwt(payload,
                                   priv_key=SECRET_KEY,
                                   algorithm=ALGORITHM,
                                   other_headers=additional_headers)

More importantly, above code is not a sololy executable snippet. Even if return can be eliminated supposing your point is that python_jwt() is having some performance issue, you should at least provide examples of all the parameters it takes so others can reproduce the issue with their environment.

To solve the issue you might also need to answer some general questions like: Had you run the code on a native environment instead of cloud? Did you use some module like timeit to identify the hotspot and what is the result?

prateek-singh-wcar · 2021-09-27T08:57:12Z

Hi
Here is the details along with answer of question you asked -

Had you run the code on a native environment instead of cloud? Yes we have run the script at different native system and its have same behavior.

Did you use some module like timeit to identify the hotspot and what is the result? No , we did not use timeit

We are using jwcrypto==0.9.1

prateek-singh-wcar · 2021-09-27T09:43:06Z

Here is the more specific script -

#from calendar import timegm

def create_token():
	custom_policies = {
                             "Env":"Dev1",
                             "Work":"AsDev"}
    expiry_time = timegm(datetime.utcnow().utctimetuple()) + 60
    private_key = get_your_private_key()
   key_id = get_your_kid #if there is any

    payload = {'iss': 'Some issuere',
               'sub': 'TestTokenSub',
               'sub-alias': 'TestToken',
               'aud': 'Only_My_Team',
               'aud-alias': 'Team_Rocks',
               'exp': expiry_time,
               'kid': key_id,
               'policy': custom_policies}

    additional_headers = {'kid': key_id}

    return python_jwt.generate_jwt(payload,
                                   priv_key=private_key,
                                   algorithm= RS256,
                                   other_headers=additional_headers)

Hope that that should work for your debugging stuff.

tiran · 2021-09-27T09:50:07Z

Could you please provide a run-able script? The code is missing several imports and function definitions for get_your_private_key() and get_your_kid().

reaperhulk · 2021-09-27T09:58:05Z

The request is for a script that can be run without modification that demonstrates the performance problem such that we can reproduce it with multiple versions of cryptography. The code provided quite clearly does not run as it is missing variables, has a commented out import that is required inside the function body, and the functionality in question is inside a function that is never called. I'm disappointed that we've had to make such a basic request multiple times when we're attempting to volunteer our time to assist you. To be more explicit you'll need to provide the following:

A requirements.txt (or its string equivalent!) that contains the exact set of dependencies required to execute your script and is pinned to the versions you're using. You sort of provided this in the initial post, but there's also been conflicting information about usage of python-jwt and jwcrypto. We need accurate information.
A script that, when executed in a virtual environment with the requirements from requirements.txt installed, reproduces the problem with no input required. e.g. I should execute python example_script.py in the venv and it works.

reaperhulk · 2021-10-04T21:51:24Z

Closing due to lack of activity. If the problem still exists feel free to reopen and provide the information from the previous comment so we can replicate and investigate.

ilya-shabalin-wcar · 2021-10-06T08:42:53Z

Hi, sorry for not following up earlier - we had paused the bump for now. Let's see if following code can help to investigate:


from calendar import timegm
from datetime import datetime


def create_token():
    custom_policies = {"Env": "Dev1", "Work": "AsDev"}
    expiry_time = timegm(datetime.utcnow().utctimetuple()) + 60
    private_key = 'MIIBVwIBADANBgkqhkiG9w0BAQEFAASCAUEwggE9AgEAAkEAvGNaerZ1lX7C94IC' \
                  'kPb6raDxY9Ecs0eBlvCL1r+2PfeORbMEOtEc94m6za9tElWF7yyCAPBoWAmoyF6c' \
                  'y+dZFwIDAQABAkEAnTyfzER2J+zBdiwBUFWULU9jqthiiBBqzTzAuuEQju6dEkav' \
                  '4F/3h2I5PZffIsl08TVXlUXjPjTJiMZeIUsbIQIhAO1Vafxq8MFi7E0XOFb1zipV' \
                  'GkYR4oH4MYeaYIz1DtWzAiEAyzRybgKvGuerL2ACeWj/lK+utp1G9PnMv7bpwWce' \
                  'BQ0CIQC/OfQUuwHTf+ZP0ybFAnckKr/pt11fRWB7F0s2J/nU+wIhAJpAnOC/rtbY' \
                  'p80Vr6skNmdTGWhMabHK8k9R9Y43zcT9AiEAiE1Fu8lQyYNoiqvEcG0w3JNh+C9A' \
                  '1vvVtwPDil+mn/c='
    payload = {'iss': 'Some issuer',
               'sub': 'TestTokenSub',
               'sub-alias': 'TestToken',
               'aud': 'Only_My_Team',
               'aud-alias': 'Team_Rocks',
               'exp': expiry_time,
               'kid': '2021-09-19',
               'policy': custom_policies}
    additional_headers = {'kid': '2021-09-19'}
    return python_jwt.generate_jwt(payload,
                                   priv_key=private_key,
                                   algorithm='RS256',
                                   other_headers=additional_headers)

Current requirements.txt (the version specified here is 2.6.1, when we bump to >3.0 - duration increases, no other changes to code / deps are done in combination to bump)

asn1crypto==1.4.0
# via cryptography
boto3==1.18.46
# via access-management-m2m-dynamodb
botocore==1.21.46
# via
# access-management-m2m-dynamodb
# boto3
# s3transfer
cffi==1.14.6
# via cryptography
cryptography==2.6.1
# via
# -r requirements.in
# jwcrypto
jwcrypto==1.0
# via python-jwt
six==1.16.0
# via
# cryptography
# ecdsa
# python-dateutil
pyasn1==0.4.8
# via
# python-jose
# rsa
pycparser==2.20
# via cffi
python-dateutil==2.8.2
# via botocore
python-jose==3.3.0
# via -r requirements.in
python-jwt==3.3.0
# via -r requirements.in
requests==2.26.0
# via -r requirements.in
rsa==4.7.2
# via python-jose
s3transfer==0.5.0
# via boto3

prateek-singh-wcar · 2021-10-06T08:50:11Z

@reaperhulk can you please re open it.

tiran · 2021-10-06T09:15:16Z

The code is not runable. It's missing imports and a if __name__ == "__main__": block.

gustavo-siqueira-wcar · 2021-10-06T10:02:01Z

Hi @tiran here's a runnable code:

import timeit

setup = """
from calendar import timegm
from datetime import datetime

import python_jwt
from jwcrypto import jwk

custom_policies = {"Env": "Dev1", "Work": "AsDev"}
expiry_time = timegm(datetime.utcnow().utctimetuple()) + 60
key = jwk.JWK.generate(kty='RSA', size=2048)
payload = {'iss': 'Some issuer',
            'sub': 'TestTokenSub',
            'sub-alias': 'TestToken',
            'aud': 'Only_My_Team',
            'aud-alias': 'Team_Rocks',
            'exp': expiry_time,
            'kid': '2021-09-19',
            'policy': custom_policies}
additional_headers = {'kid': '2021-09-19'}
"""

stmt = """
python_jwt.generate_jwt(
    payload,
    priv_key=key,
    algorithm='RS256',
    other_headers=additional_headers
)
"""


if __name__ == "__main__":
    print(timeit.timeit(stmt=stmt, setup=setup, number=5))

env with cryptography==2.6.1:

asn1crypto==1.4.0
cffi==1.14.6
cryptography==2.6.1
Deprecated==1.2.13
jwcrypto==1.0
pycparser==2.20
python-jwt==3.3.0
six==1.16.0
wrapt==1.13.1

env with Cryptography==3.2:

cffi==1.14.6
cryptography==3.2
Deprecated==1.2.13
jwcrypto==1.0
pycparser==2.20
python-jwt==3.3.0
six==1.16.0
wrapt==1.13.1

And here's the difference between the outputs from timeit for each version:
Using 2.6.1: 0.007067927999742096
Using 3.2: 0.03307457300252281

tiran · 2021-10-06T10:09:56Z

Thank you, that's a perfect!

I can reproduce the issue on my laptop:

2.6.1: 0.006050721975043416
3.2: 0.019777562003582716
3.4.8: 0.02039522398263216

tiran · 2021-10-06T10:20:32Z

It looks like the performance issue is caused by RSA_check_key, which was introduced in PR #5307.

1,000 iterations of python_jwt.generate_jwt() with 3.4.8:

         1328307 function calls (1242804 primitive calls) in 4.404 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1001    2.838    0.003    2.838    0.003 {built-in method _openssl.RSA_check_key}
     2000    0.564    0.000    0.564    0.000 {built-in method _openssl.EVP_PKEY_sign}
     1001    0.368    0.000    0.368    0.000 {built-in method _openssl.RSA_blinding_on}
85000/1000    0.084    0.000    0.197    0.000 copy.py:128(deepcopy)
        1    0.032    0.032    0.032    0.032 {built-in method _openssl.RSA_generate_key_ex}
     5000    0.027    0.000    0.027    0.000 encoder.py:204(iterencode)
    12000    0.020    0.000    0.112    0.000 copy.py:258(_reconstruct)
   200041    0.017    0.000    0.017    0.000 {method 'get' of 'dict' objects}
    72000    0.016    0.000    0.080    0.000 copy.py:263(<genexpr>)
     1000    0.014    0.000    4.300    0.004 __init__.py:17(generate_jwt)

1,000 iterations with 2.6.1:

         1345377 function calls (1259535 primitive calls) in 1.541 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     2000    0.556    0.000    0.556    0.000 {built-in method _openssl.EVP_PKEY_sign}
     1000    0.367    0.000    0.367    0.000 {built-in method _openssl.RSA_blinding_on}
85000/1000    0.084    0.000    0.195    0.000 copy.py:128(deepcopy)
     5000    0.026    0.000    0.026    0.000 encoder.py:204(iterencode)
    12000    0.020    0.000    0.112    0.000 copy.py:258(_reconstruct)
   200986    0.017    0.000    0.017    0.000 {method 'get' of 'dict' objects}
    72000    0.016    0.000    0.080    0.000 copy.py:263(<genexpr>)
        1    0.015    0.015    0.015    0.015 {built-in method _openssl.RSA_generate_key_ex}
     1000    0.013    0.000    1.442    0.001 __init__.py:17(generate_jwt)
   105726    0.012    0.000    0.017    0.000 {built-in method builtins.isinstance}

reaperhulk · 2021-10-06T10:24:24Z

Ah, interesting. We actually have an unofficial method for bypassing that check but it will never be part of our public API because it has serious security implications. You can see where it’s set

cryptography/src/cryptography/hazmat/backends/openssl/backend.py

Line 190 in 17aeaa6

self._rsa_skip_check_key = False

and where it’s consumed:

cryptography/src/cryptography/hazmat/backends/openssl/rsa.py

Lines 385 to 390 in 17aeaa6

    
           def __init__(self, backend, rsa_cdata, evp_pkey, _skip_check_key): 
        
               # RSA_check_key is slower in OpenSSL 3.0.0 due to improved 
        
               # primality checking. In normal use this is unlikely to be a problem 
        
               # since users don't load new keys constantly, but for TESTING we've 
        
               # added an init arg that allows skipping the checks. You should not 
        
               # use this in production code unless you understand the consequences.

tiran · 2021-10-06T10:35:59Z

Cryptography could keep a positive cache for checked keys, too. This may not help with lambdas, but it will speed up long-running processes that load the same key over and over again.

 class _RSAPrivateKey(RSAPrivateKey):
+    # positive cache for checked keys
+    _checked_keys = {}
+
     def __init__(self, backend, rsa_cdata, evp_pkey, _skip_check_key):
         # RSA_check_key is slower in OpenSSL 3.0.0 due to improved
         # primality checking. In normal use this is unlikely to be a problem
         # since users don't load new keys constantly, but for TESTING we've
         # added an init arg that allows skipping the checks. You should not
         # use this in production code unless you understand the consequences.
-        if not _skip_check_key:
+        if not _skip_check_key and hash(rsa_cdata) not in self._checked_keys:
             res = backend._lib.RSA_check_key(rsa_cdata)
             if res != 1:
                 errors = backend._consume_errors_with_text()
                 raise ValueError("Invalid private key", errors)
+            self._checked_keys.add(hash(rsa_cdata))
 
         # Blinding is on by default in many versions of OpenSSL, but let's
         # just be conservative here.

reaperhulk · 2021-10-06T10:39:57Z

That check won’t be useful because definitionally if we’re hitting this case repeatedly the user is loading the key repeatedly instead of caching for reuse themselves. So we’re allocating new openssl rsa_cdata structures.

alex · 2021-10-06T13:26:49Z

I've dug deep into the belly of these JWT libs. It's a design issue there. The problem is that every time you attempt to do any cryptographic operation, it reloads the key from scratch: https://github.com/latchset/jwcrypto/blob/14c942d616c850e33c9fa11a38b0fc8bbb0a055c/jwcrypto/jwk.py#L826-L828

While the more comprehensive RSA primarily checks exacerbte this, it was already quite wasteful even before.

The correct solution is, once you have an RSAPrivateKey object, to simply keep it around, rather than throwing it away and reparsing every time.

simo5 · 2021-11-30T21:30:58Z

FYI: latchset/jwcrypto#254

reaperhulk closed this as completed Oct 4, 2021

simo5 mentioned this issue Oct 6, 2021

Loading keys from pyca may be expensive latchset/jwcrypto#243

Closed

github-actions bot locked as resolved and limited conversation to collaborators Mar 1, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

AWS python lambda execution time get doubled in cryptography >= 3.1 #6301

AWS python lambda execution time get doubled in cryptography >= 3.1 #6301

prateek-singh-wcar commented Sep 24, 2021

alex commented Sep 24, 2021

prateek-singh-wcar commented Sep 24, 2021 •

edited

Loading

alex commented Sep 24, 2021

0xaead commented Sep 25, 2021 •

edited

Loading

prateek-singh-wcar commented Sep 27, 2021

prateek-singh-wcar commented Sep 27, 2021 •

edited

Loading

tiran commented Sep 27, 2021

reaperhulk commented Sep 27, 2021

reaperhulk commented Oct 4, 2021

ilya-shabalin-wcar commented Oct 6, 2021 •

edited

Loading

prateek-singh-wcar commented Oct 6, 2021

tiran commented Oct 6, 2021

gustavo-siqueira-wcar commented Oct 6, 2021

tiran commented Oct 6, 2021

tiran commented Oct 6, 2021

reaperhulk commented Oct 6, 2021 •

edited

Loading

tiran commented Oct 6, 2021

reaperhulk commented Oct 6, 2021

alex commented Oct 6, 2021

simo5 commented Nov 30, 2021

AWS python lambda execution time get doubled in cryptography >= 3.1 #6301

AWS python lambda execution time get doubled in cryptography >= 3.1 #6301

Comments

prateek-singh-wcar commented Sep 24, 2021

alex commented Sep 24, 2021

prateek-singh-wcar commented Sep 24, 2021 • edited Loading

alex commented Sep 24, 2021

0xaead commented Sep 25, 2021 • edited Loading

prateek-singh-wcar commented Sep 27, 2021

prateek-singh-wcar commented Sep 27, 2021 • edited Loading

tiran commented Sep 27, 2021

reaperhulk commented Sep 27, 2021

reaperhulk commented Oct 4, 2021

ilya-shabalin-wcar commented Oct 6, 2021 • edited Loading

prateek-singh-wcar commented Oct 6, 2021

tiran commented Oct 6, 2021

gustavo-siqueira-wcar commented Oct 6, 2021

tiran commented Oct 6, 2021

tiran commented Oct 6, 2021

reaperhulk commented Oct 6, 2021 • edited Loading

tiran commented Oct 6, 2021

reaperhulk commented Oct 6, 2021

alex commented Oct 6, 2021

simo5 commented Nov 30, 2021

prateek-singh-wcar commented Sep 24, 2021 •

edited

Loading

0xaead commented Sep 25, 2021 •

edited

Loading

prateek-singh-wcar commented Sep 27, 2021 •

edited

Loading

ilya-shabalin-wcar commented Oct 6, 2021 •

edited

Loading

reaperhulk commented Oct 6, 2021 •

edited

Loading