Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(idempotency): support composite primary key in DynamoDBPersistenceLayer #740

Merged
merged 4 commits into from
Nov 9, 2021

Conversation

Tankanow
Copy link
Contributor

@Tankanow Tankanow commented Oct 6, 2021

Issue #694

Description of changes:

Add new optional parameters, key_attr_value and sort_key_attr, to DynamoDBPersistenceLayer. If sort_key_attr is set, then DynamoDBPersistenceLayer will use a composite primary key with the partition key set to a static value, defaults to powertools#idempotency, and the sort key set to the hashed idempotency key.

Checklist

Breaking change checklist

RFC issue #:

  • Migration process documented
  • Implement warnings (if it can live side by side)

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@pull-request-size pull-request-size bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 6, 2021
@@ -20,6 +20,8 @@ def __init__(
self,
table_name: str,
key_attr: str = "id",
key_attr_value: str = "powertools#idempotency",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@heitorlessa, I created this as a Draft PR to get your early feedback on the interface. Once we agree on the ergonomics, I'm happy to update documentation and add examples.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@heitorlessa, I didn't see any feedback so I moved this from Draft to Ready. I would still like to update documentation and examples before merging this PR, and would love some feedback on the API before I take the time to do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because DynamoDB IAM resource conditions don't allow you to restrict row level access based on SK, where the function name is known, I'd suggest we change the primary key slightly to: idempotency#function-name.

If they want, for example, to restrict anyone from ever modifying idempotency tokens for X function that has a prod stage or something more specific to their business in the name they now can.

Here are the scenarios I understood by going through the PR, as C is new - please correct me if I misunderstood them.


Scenarios

A) No hash key set.: Use default hash key and value

from aws_lambda_powertools.utilities.idempotency import DynamoDBPersistenceLayer

# KEY_ATTR="id"
# KEY_ATTR_VALUE="test-func#ec534ff6b4746107270b412333b59f52"
persistence_layer = DynamoDBPersistenceLayer(table_name="IdempotencyTable")

B) Hash key set.: Use new hash key and default value

# KEY_ATTR="idempotency_key"
# KEY_ATTR_VALUE="test-func#ec534ff6b4746107270b412333b59f52"
persistence_layer = DynamoDBPersistenceLayer(table_name="IdempotencyTable", key_attr="idempotency_key")

C) Hash and Sort key set.: Use new hash and sort key, and default value as sort key

# KEY_ATTR="pk"
# SORT_KEY_ATTR="sk"
# KEY_ATTR_VALUE="idempotency#test-func"
# SORT_KEY_ATTR_VALUE="test-func#ec534ff6b4746107270b412333b59f52"
persistence_layer = DynamoDBPersistenceLayer(
    table_name="IdempotencyTable", 
    key_attr="idempotency_key",
    sort_key_attr="idempotency_key")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@heitorlessa, re the scenarios: yes, C is new. I think there is some copy/pasta in the code, the key_attr and sort_key_attr should match the comments, as in:

# KEY_ATTR="pk"
# SORT_KEY_ATTR="sk"
# KEY_ATTR_VALUE="idempotency#test-func"
# SORT_KEY_ATTR_VALUE="test-func#ec534ff6b4746107270b412333b59f52"
persistence_layer = DynamoDBPersistenceLayer(
    table_name="IdempotencyTable", 
    key_attr="pk",       # <- changed to match comment
    sort_key_attr="sk")  # <- changed to match comment

There is also a related scenario in which the DynamoDBPersistenceLayer passes the desired value for key_attr_value, e.g.:

# KEY_ATTR="pk"
# SORT_KEY_ATTR="sk"
# KEY_ATTR_VALUE="known-static-partition-key-value"
# SORT_KEY_ATTR_VALUE="test-func#ec534ff6b4746107270b412333b59f52"
persistence_layer = DynamoDBPersistenceLayer(
    table_name="IdempotencyTable", 
    key_attr="pk",
    key_attr_value="known-static-partition-key-value",       
    sort_key_attr="sk")

@Tankanow Tankanow marked this pull request as ready for review October 8, 2021 12:18
@heitorlessa
Copy link
Contributor

heitorlessa commented Oct 8, 2021 via email

@Tankanow
Copy link
Contributor Author

Tankanow commented Oct 8, 2021 via email

@heitorlessa heitorlessa self-assigned this Oct 8, 2021
@heitorlessa heitorlessa added the feature New feature or functionality label Oct 8, 2021
@heitorlessa
Copy link
Contributor

Not at all ;) I was thinking about the cost implications of having a single table for Idempotency -- as in, how would you know whether Idempotency is increasing the overall bill?

E.g. if you're Idempotency table is seeing a lot of activities, this is a good chance to verify whether you have a good strategy for tokens.

With a single table it's difficult to have this visibility due to other data entities being there.

That said, we still should support customers table that use a composite key -- I've got some unexpected meetings but haven't forgot about this yet, and it'll be part of the next release

Thank you for the patience @Tankanow

Copy link
Contributor

@heitorlessa heitorlessa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two small changes: 1/ Change PK to something IAM can restrict later, 2/ rollback test fixtures changes so we can do in a separate PR.

@@ -20,6 +20,8 @@ def __init__(
self,
table_name: str,
key_attr: str = "id",
key_attr_value: str = "powertools#idempotency",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because DynamoDB IAM resource conditions don't allow you to restrict row level access based on SK, where the function name is known, I'd suggest we change the primary key slightly to: idempotency#function-name.

If they want, for example, to restrict anyone from ever modifying idempotency tokens for X function that has a prod stage or something more specific to their business in the name they now can.

Here are the scenarios I understood by going through the PR, as C is new - please correct me if I misunderstood them.


Scenarios

A) No hash key set.: Use default hash key and value

from aws_lambda_powertools.utilities.idempotency import DynamoDBPersistenceLayer

# KEY_ATTR="id"
# KEY_ATTR_VALUE="test-func#ec534ff6b4746107270b412333b59f52"
persistence_layer = DynamoDBPersistenceLayer(table_name="IdempotencyTable")

B) Hash key set.: Use new hash key and default value

# KEY_ATTR="idempotency_key"
# KEY_ATTR_VALUE="test-func#ec534ff6b4746107270b412333b59f52"
persistence_layer = DynamoDBPersistenceLayer(table_name="IdempotencyTable", key_attr="idempotency_key")

C) Hash and Sort key set.: Use new hash and sort key, and default value as sort key

# KEY_ATTR="pk"
# SORT_KEY_ATTR="sk"
# KEY_ATTR_VALUE="idempotency#test-func"
# SORT_KEY_ATTR_VALUE="test-func#ec534ff6b4746107270b412333b59f52"
persistence_layer = DynamoDBPersistenceLayer(
    table_name="IdempotencyTable", 
    key_attr="idempotency_key",
    sort_key_attr="idempotency_key")

@Tankanow
Copy link
Contributor Author

Tankanow commented Oct 13, 2021

Not at all ;) I was thinking about the cost implications of having a single table for Idempotency -- as in, how would you know whether Idempotency is increasing the overall bill?

E.g. if you're Idempotency table is seeing a lot of activities, this is a good chance to verify whether you have a good strategy for tokens.

With a single table it's difficult to have this visibility due to other data entities being there.

That said, we still should support customers table that use a composite key -- I've got some unexpected meetings but haven't forgot about this yet, and it'll be part of the next release

Thank you for the patience @Tankanow

@heitorlessa,

This is such a good question; I'm delighted to discuss the cost implications! (Actually, my day job is to build an AWS cost intelligence tool on top of serverless tech). There are of many technical solutions here, though I think the key is to think in higher order KPIs. Some questions to ask are "What User Interactions relate to my idempotent queries/updates?" and "What Features, Customers, etc. do those User Interactions roll up to?" That way companies can tie the cost back to user value! The key for this library would be to ensure that clients can set the DynamoDB keys in such a way that they can easily do this analysis without full table scans.

@pull-request-size pull-request-size bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 13, 2021
@Tankanow Tankanow requested a review from heitorlessa October 13, 2021 21:55
@@ -64,10 +73,14 @@ def __init__(

self._boto_config = boto_config or Config()
self._boto3_session = boto3_session or boto3.session.Session()
if sort_key_attr == key_attr:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE: I snuck in this extra check =).

@heitorlessa
Copy link
Contributor

I'll look into this right after merging the new Event Handle Router, so we can merge this later ;)

@to-mc
Copy link
Contributor

to-mc commented Oct 15, 2021

This looks good to me - small nitpicks only! Could we possibly change the name of key_attr_value to something more descriptive? Perhaps static_key_value? This could just be a symptom of Friday afternoon, but it took me a while to understand from the code that this was just allowing users to set a static value for the partition key when using a composite primary key.

Also, can we change the variable default to remove the "test-func" part. I understand that code is probably only exercised when testing (outside of a Lambda environment), but it feels wrong to have that in the "real" class. How about letting it be an empty string:

f"idempotency#{os.getenv(constants.LAMBDA_FUNCTION_NAME_ENV, '')}",

@Tankanow
Copy link
Contributor Author

Thanks for you patience @heitorlessa and @cakepietoast.

I made the f"idempotency#{os.getenv(constants.LAMBDA_FUNCTION_NAME_ENV, '')}", change.

Re static_key_value vs key_attr_value, I'm not sure the former is more descriptive; I like the latter because it mirrors the key_attr argument.

@to-mc
Copy link
Contributor

to-mc commented Oct 29, 2021

Thanks for you patience @heitorlessa and @cakepietoast.

I made the f"idempotency#{os.getenv(constants.LAMBDA_FUNCTION_NAME_ENV, '')}", change.

Re static_key_value vs key_attr_value, I'm not sure the former is more descriptive; I like the latter because it mirrors the key_attr argument.

Apologies for the late response on this one, I missed your reply. I understand your point that key_attr_value looks nice next to key_attr, but I have a hard time with the semantics. key_attr is the name of attribute to use as the partition key. partition_key_attr would have been a much better name, but its hard to change now without breaking compatibility - my mistake there is partly why I'm being a bit persistent on this topic 😄

"Key attribute value" sounds like we're setting the value of the "key attribute", which to my mind is saying the same thing as setting the "key attribute". What we're actually doing is setting a static value to use for the partition key, which is where my suggestion came from.

Do you feel any better about static_pk_value? If you don't mind changing that we'll get this merged and documented asap.

@heitorlessa heitorlessa removed their assignment Oct 29, 2021
@Tankanow
Copy link
Contributor Author

Tankanow commented Nov 1, 2021

Thanks for you patience @heitorlessa and @cakepietoast.
I made the f"idempotency#{os.getenv(constants.LAMBDA_FUNCTION_NAME_ENV, '')}", change.
Re static_key_value vs key_attr_value, I'm not sure the former is more descriptive; I like the latter because it mirrors the key_attr argument.

Apologies for the late response on this one, I missed your reply. I understand your point that key_attr_value looks nice next to key_attr, but I have a hard time with the semantics. key_attr is the name of attribute to use as the partition key. partition_key_attr would have been a much better name, but its hard to change now without breaking compatibility - my mistake there is partly why I'm being a bit persistent on this topic 😄

"Key attribute value" sounds like we're setting the value of the "key attribute", which to my mind is saying the same thing as setting the "key attribute". What we're actually doing is setting a static value to use for the partition key, which is where my suggestion came from.

Do you feel any better about static_pk_value? If you don't mind changing that we'll get this merged and documented asap.

I like static_pk_value better. Making the change ASAP.

@to-mc
Copy link
Contributor

to-mc commented Nov 3, 2021

I like static_pk_value better. Making the change ASAP.

Much appreciated!

@pull-request-size pull-request-size bot removed the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Nov 3, 2021
@pull-request-size pull-request-size bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Nov 3, 2021
@Tankanow
Copy link
Contributor Author

Tankanow commented Nov 3, 2021

I like static_pk_value better. Making the change ASAP.

Much appreciated!

Done. Ready for re-review.

@to-mc to-mc self-requested a review November 5, 2021 10:21
Copy link
Contributor

@to-mc to-mc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@to-mc to-mc merged commit 6c451f7 into aws-powertools:develop Nov 9, 2021
@heitorlessa heitorlessa changed the title improv: Support a composite primary key in DynamoDBPersistenceLayer feat(idempotency): support composite primary key in DynamoDBPersistenceLayer Nov 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or functionality size/M Denotes a PR that changes 30-99 lines, ignoring generated files. tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants