Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: enable additional metadata collection (under feature flag) #33232

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

GavinZZ
Copy link
Contributor

@GavinZZ GavinZZ commented Jan 30, 2025

Issue # (if applicable)

#33198

Note

The majority of the code changes are auto generated so you'll see hundreds of addConstructMetadata method call across different L2 resources.

This method comes from this change https://github.com/aws/aws-cdk/pull/33232/files#diff-81f821b1205e7040fc3103bf7c0114060a6d5c43ebd2994aa4ed5906e42c9c5fR33. The main code change that needs to be reviewed is in packages/aws-cdk-lib/core as well as tools/@aws-cdk/construct-metadata-updater

Reason for this change

This discussion aims to expand the scope of usage data collected by the AWS CDK to better inform CDK development and improve communication for customer-impacting topics. Currently, for those that opt in, the CDK collects usage data on your CDK version and which L2 constructs you use.

Description of changes

  1. Update CDK synthesis code to additionally handle resource metadata. On feature flag set to true, synthesis will not only inject Metadata usage like version and construct name, it will additionally look for any construct/method/feature flag metadata injected during resource creation. On feature flag set to false, it should be the same as before.
  2. One-time tool metadata-updater to automatically find the right classes and add import statements and add metadata statements. The tool can be run multiple times and should not add additional import or add metadata statements to files that already been added. An action item is to link the tool to a GHA to periodically run this.
  3. Build a workflow (that will be linked to GHA in the future) and when redacting, redact based on the value. The workflow on run will parse all files in aws-cdk repository and built a JSON file that contains all constructs and loggable properties of the construct. When redacting, only log the property key if the key exists in the JSON file. The value will be logged only if the value is not a * in the JSON file. Everything else is redacted for safely.
  4. Build a JSON blueprint of the ENUM values and do not redact ENUM values.

Consider the following example

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as s3 from 'aws-cdk-lib/aws-s3';

class MyStack extends cdk.Stack {
  constructor(scope, id, props) {
    super(scope, id, props);

    // Create an S3 bucket (L2 construct)
    const myBucket = new s3.Bucket(this, 'MyBucket', {
      bucketName: 'my-cdk-example-bucket', // String type
      versioned: true,                    // Boolean type
      removalPolicy: cdk.RemovalPolicy.DESTROY, // ENUM type
      lifecycleRules: [{                  // Array of object type
        expirationDate: new Date('2019-10-01'),
        objectSizeLessThan: 600,
        objectSizeGreaterThan: 500,
      }],
    });

    // Use a method of the L2 construct to define additional properties
    myBucket.addLifecycleRule({
      id: 'ExpireOldObjects',
      enabled: true, // Boolean
      expiration: cdk.Duration.days(90), // Expire objects after 90 days
    });
  }
}

// Define the CDK app and stack
const app = new cdk.App();
new MyStack(app, 'MyStack');
app.synth();

At synthesis, usage data is collected, compressed, and stored in the AWS::CDK::Metadata resource. Based on current behavior, the following is an example of the usage data that will be collected from our example app:

{ 
    "fqn": "aws-cdk-lib.aws-s3.Bucket", 
    "version": "v2.170.0" 
}

With this proposal, the following usage data will be collected. The * value replaces property values that will be redacted from data collection:

{ 
    "fqn": "aws-cdk-lib.aws_s3.Bucket", 
    "version": "2.170.0", 
    "metadata": [ 
        { 
            "type": "aws:cdk:analytics:construct", 
            "data": { 
                "bucketName": "*", 
                "versioned": true, 
                "removalPolicy": "cdk.RemovalPolicy.DESTROY", 
                "lifecycleRules": [ 
                    { 
                        "expirationDate": "*", 
                        "objectSizeLessThan": "*", 
                        "objectSizeGreaterThan": "*" 
                    } 
                ] 
            } 
        }, 
        { 
            "type": "aws:cdk:analytics:method", 
            "data": { 
                "name": "addLifecycleRule", 
                "prop": { 
                    "id": "*", 
                    "enabled": true, 
                    "expiration": "*",
                } 
            } 
        } 
    ] 
}

Describe any new or updated permissions being added

No

Description of how you validated changes

Many new unit tests added to verify different behaviour of various functions and methods introduced. One integ test file is added to test the deployability.

Checklist


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@GavinZZ GavinZZ requested a review from a team as a code owner January 30, 2025 03:14
@aws-cdk-automation aws-cdk-automation requested a review from a team January 30, 2025 03:14
@github-actions github-actions bot added the p2 label Jan 30, 2025
@mergify mergify bot added the contribution/core This is a PR that came from AWS. label Jan 30, 2025
Comment on lines +228 to +229
// Enhanced CDK Analytics Telemetry
addConstructMetadata(this, props);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The majority of the code changes are auto generated so you'll see hundreds of addConstructMetadata method call across different L2 resources.

This method comes from this change https://github.com/aws/aws-cdk/pull/33232/files#diff-81f821b1205e7040fc3103bf7c0114060a6d5c43ebd2994aa4ed5906e42c9c5fR33. The main code change that needs to be reviewed is in packages/aws-cdk-lib/core as well as tools/@aws-cdk/construct-metadata-updater

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only support collecting metadata from the constructor for now .. right ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR only supports constructor collection. I'll work on the method collections but I don't think that blocks us reviewing and merging this PR.

@aws-cdk-automation aws-cdk-automation added the pr/needs-maintainer-review This PR needs a review from a Core Team Member label Jan 30, 2025
@moelasmar
Copy link
Contributor

are you going to add the GH workflow in a separate PR ?

@GavinZZ
Copy link
Contributor Author

GavinZZ commented Jan 31, 2025

are you going to add the GH workflow in a separate PR ?

Yes, it's going to be a separate PR.

Copy link
Contributor

@xazhao xazhao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me overall.

*/
export function redactMetadata(fqn: string, data: any): any {
// A valid fqn is consists of 3 parts, i.e. `aws-cdk-lib.aws-lambda.Function`.
const fqnParts = fqn.replace(/[-_]/g, '-').split('.');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: is there an example we need to replace _ with -?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, for example, IAM's fqn is something like "fqn": "aws-cdk-lib.aws_iam.CfnRole",

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildv2Project1C6BFA3F-wQm2hXv2jqQv
  • Commit ID: d547015
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@GavinZZ GavinZZ added the pr-linter/exempt-codecov The PR linter will not require codecov checks to pass label Jan 31, 2025
@GavinZZ
Copy link
Contributor Author

GavinZZ commented Jan 31, 2025

Added exempt-codecov because we added addConstructMetadata statements to a number of L2 constructs. We can't exhaustively test all changes and the code coverage drop is expected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contribution/core This is a PR that came from AWS. p2 pr/needs-maintainer-review This PR needs a review from a Core Team Member pr-linter/exempt-codecov The PR linter will not require codecov checks to pass
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants