diff --git a/README.md b/README.md
index 33a9491..c09ac99 100644
--- a/README.md
+++ b/README.md
@@ -377,10 +377,23 @@ This section describes security controls and best practices implemented by the s
 #### Preventive
 We use an IAM role policy which enforce usage of specific security controls. For example, all SageMaker workloads must be created in the VPC with specified security groups and subnets:
 ```json
-"Condition": {
-    "Null": {
-        "sagemaker:VpcSecurityGroupIds": "true"
-    }
+{
+    "Condition": {
+        "Null": {
+            "sagemaker:VpcSubnets": "true"
+        }
+    },
+    "Action": [
+        "sagemaker:CreateNotebookInstance",
+        "sagemaker:CreateHyperParameterTuningJob",
+        "sagemaker:CreateProcessingJob",
+        "sagemaker:CreateTrainingJob",
+        "sagemaker:CreateModel"
+    ],
+    "Resource": [
+        "arn:aws:sagemaker:*::*"
+    ],
+    "Effect": "Deny"
 }
 ```
 [List of IAM policy conditions for Amazon SageMaker](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonsagemaker.html)
@@ -692,6 +705,95 @@ aws sts get-caller-identity
 
 All operations are performed under the SageMaker execution role.
 
+## Test preventive IAM policies
+Try to start a training job without VPC attachment:
+```python
+container_uri = sagemaker.image_uris.retrieve(region=session.region_name,
+                                              framework='xgboost',
+                                              version='1.0-1',
+                                              image_scope='training')
+
+xgb = sagemaker.estimator.Estimator(image_uri=container_uri,
+                                    role=sagemaker_execution_role,
+                                    instance_count=2,
+                                    instance_type='ml.m5.xlarge',
+                                    output_path='s3://{}/{}/model-artifacts'.format(default_bucket, prefix),
+                                    sagemaker_session=sagemaker_session,
+                                    base_job_name='reorder-classifier',
+                                    # subnets=network_config.subnets,                         # omitted: no VPC attachment
+                                    # security_group_ids=network_config.security_group_ids,  # omitted: no VPC attachment
+                                    encrypt_inter_container_traffic=network_config.encrypt_inter_container_traffic,
+                                    enable_network_isolation=network_config.enable_network_isolation,
+                                    volume_kms_key=ebs_kms_id,
+                                    output_kms_key=s3_kms_id
+                                    )
+
+xgb.set_hyperparameters(objective='binary:logistic',
+                        num_round=100)
+
+xgb.fit({'train': train_set_pointer, 'validation': validation_set_pointer})
+```
+
+
+You will get an `AccessDeniedException` because of the explicit `Deny` in the IAM policy:
+
+![start-training-job-without-vpc](img/start-training-job-without-vpc.png)
+![accessdeniedexception](img/accessdeniedexception.png)
+
+IAM policy:
+```json
+{
+    "Condition": {
+        "Null": {
+            "sagemaker:VpcSubnets": "true",
+            "sagemaker:VpcSecurityGroupIds": "true"
+        }
+    },
+    "Action": [
+        "sagemaker:CreateNotebookInstance",
+        "sagemaker:CreateHyperParameterTuningJob",
+        "sagemaker:CreateProcessingJob",
+        "sagemaker:CreateTrainingJob",
+        "sagemaker:CreateModel"
+    ],
+    "Resource": [
+        "arn:aws:sagemaker:*::*"
+    ],
+    "Effect": "Deny"
+}
+```
+
+Now add the secure network configuration to the `Estimator`:
+```python
+network_config = NetworkConfig(
+    enable_network_isolation=False,
+    security_group_ids=env_data["SecurityGroups"],
+    subnets=env_data["SubnetIds"],
+    encrypt_inter_container_traffic=True)
+```
+
+```python
+xgb = sagemaker.estimator.Estimator(
+    image_uri=container_uri,
+    role=sagemaker_execution_role,
+    instance_count=2,
+    instance_type='ml.m5.xlarge',
+    output_path='s3://{}/{}/model-artifacts'.format(default_bucket, prefix),
+    sagemaker_session=sagemaker_session,
+    base_job_name='reorder-classifier',
+
+    subnets=network_config.subnets,
+    security_group_ids=network_config.security_group_ids,
+    encrypt_inter_container_traffic=network_config.encrypt_inter_container_traffic,
+    enable_network_isolation=network_config.enable_network_isolation,
+    volume_kms_key=ebs_kms_id,
+    output_kms_key=s3_kms_id
+
+    )
+```
+
+Now you can create and run the training job.
+
 # Deployment
 
 ## Pre-requisites
@@ -1115,6 +1217,8 @@ Second, do the steps from **Clean-up considerations** section.
 - [R25]: [Machine learning best practices in financial services](https://aws.amazon.com/blogs/machine-learning/machine-learning-best-practices-in-financial-services/)
 - [R26]: [Machine Learning Best Practices in Financial Services](https://d1.awsstatic.com/whitepapers/machine-learning-in-financial-services-on-aws.pdf)
 - [R27]: [Dynamic A/B testing for machine learning models with Amazon SageMaker MLOps projects](https://aws.amazon.com/blogs/machine-learning/dynamic-a-b-testing-for-machine-learning-models-with-amazon-sagemaker-mlops-projects/)
+- [R28]: [Hosting a private PyPI server for Amazon SageMaker Studio notebooks in a VPC](https://aws.amazon.com/blogs/machine-learning/hosting-a-private-pypi-server-for-amazon-sagemaker-studio-notebooks-in-a-vpc/)
+- [R29]: [Automate a centralized deployment of Amazon SageMaker Studio with AWS Service Catalog](https://aws.amazon.com/blogs/machine-learning/automate-a-centralized-deployment-of-amazon-sagemaker-studio-with-aws-service-catalog/)
 
 ## AWS Solutions
 - [SOL1]: [AWS MLOps Framework](https://aws.amazon.com/solutions/implementations/aws-mlops-framework/)
diff --git a/cfn_templates/env-iam.yaml b/cfn_templates/env-iam.yaml
index e2d9472..66f0a04 100644
--- a/cfn_templates/env-iam.yaml
+++ b/cfn_templates/env-iam.yaml
@@ -366,6 +366,7 @@ Resources:
                   - 'sagemaker:CreateHyperParameterTuningJob'
                   - 'sagemaker:CreateProcessingJob'
                   - 'sagemaker:CreateTrainingJob'
+                  - 'sagemaker:CreateCompilationJob'
                   - 'sagemaker:CreateModel'
                 Resource:
                   - !Sub 'arn:aws:sagemaker:*:${AWS::AccountId}:*'
diff --git a/img/accessdeniedexception.png b/img/accessdeniedexception.png
new file mode 100644
index 0000000..922dd4c
Binary files /dev/null and b/img/accessdeniedexception.png differ
diff --git a/img/start-training-job-without-vpc.png b/img/start-training-job-without-vpc.png
new file mode 100644
index 0000000..6415594
Binary files /dev/null and b/img/start-training-job-without-vpc.png differ
diff --git a/package-cfn.md b/package-cfn.md
index 7d1db0f..5db976d 100644
--- a/package-cfn.md
+++ b/package-cfn.md
@@ -45,14 +45,14 @@ aws cloudformation describe-stacks \
 
 📜 **Save it to your scratch pad for later use.**
 
-7. Check that the deployment templates are uploaded into the S3 bucket:
+6. Check that the deployment templates are uploaded into the S3 bucket:
 ```sh
 aws s3 ls s3://${S3_BUCKET_NAME}/sagemaker-mlops/ --recursive
 ```
 
 ![upoaded-cfn-templates-ls](img/upoaded-cfn-templates-ls.png)
 
-Now all deployment CloudFormation templates are packaged and uploaded to your S3 bucket. You can proceed with [further deployment steps](README.md#Deployment). You can check that by running the following command:
+Now all deployment CloudFormation templates are packaged and uploaded to your S3 bucket. You can proceed with [further deployment steps](README.md#Deployment).
 
 ## Option 2 - use a shell script to package and upload
 If you use macOS/Linux you can run the delivered packaging script via `make` command. This script **will not** work on Windows.
diff --git a/package-cfn.yaml b/package-cfn.yaml
index 13c6ca6..924b68f 100644
--- a/package-cfn.yaml
+++ b/package-cfn.yaml
@@ -32,7 +32,7 @@ Outputs:
     Value: !Sub 'https://console.aws.amazon.com/cloudformation/home?region=${AWS::Region}#/stacks/new?templateURL=https://s3.${AWS::Region}.amazonaws.com/${S3BucketName}/sagemaker-mlops/data-science-environment-quickstart.yaml'
 
   StartBuildCLICommand:
-    Description: Link to open CloudFormation with core infrastructure stack
+    Description: CLI command to start the CodeBuild build
     Value: !Sub 'aws codebuild start-build --project-name ${CfnTemplatePackageProject}'
 
 Resources:
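
Note on the `Deny` statement documented in the README hunks of this patch: it relies on the IAM `Null` condition operator, and the keys listed inside one operator block are combined with a logical AND. The sketch below is a toy, offline model of that behavior, useful for reasoning about which requests the statement matches. It is not IAM's evaluation engine. The helper names `null_condition_matches` and `is_denied` are hypothetical, resource matching is ignored, key names are compared case-sensitively, and the request is reduced to the set of condition keys it carries.

```python
# Toy model of the README's Deny statement (assumption: simplified IAM semantics).
DENY_WITHOUT_VPC = {
    "Effect": "Deny",
    "Action": [
        "sagemaker:CreateNotebookInstance",
        "sagemaker:CreateHyperParameterTuningJob",
        "sagemaker:CreateProcessingJob",
        "sagemaker:CreateTrainingJob",
        "sagemaker:CreateModel",
    ],
    "Condition": {
        "Null": {
            "sagemaker:VpcSubnets": "true",
            "sagemaker:VpcSecurityGroupIds": "true",
        }
    },
}


def null_condition_matches(null_block, request_keys):
    """Return True only if every key in the `Null` block matches (logical AND).

    "true" means the key must be absent from the request,
    "false" means the key must be present.
    """
    for key, expected in null_block.items():
        key_is_absent = key not in request_keys
        wants_absent = str(expected).lower() == "true"
        if key_is_absent != wants_absent:
            return False
    return True


def is_denied(statement, action, request_keys):
    """Hypothetical helper: does this Deny statement match the request?"""
    if statement["Effect"] != "Deny" or action not in statement["Action"]:
        return False
    return null_condition_matches(statement["Condition"]["Null"], request_keys)


vpc_keys = {"sagemaker:VpcSubnets", "sagemaker:VpcSecurityGroupIds"}
print(is_denied(DENY_WITHOUT_VPC, "sagemaker:CreateTrainingJob", set()))
print(is_denied(DENY_WITHOUT_VPC, "sagemaker:CreateTrainingJob", vpc_keys))
```

In this model the first call matches the `Deny` (no VPC keys in the request) and the second does not. Because of the AND, a request carrying only one of the two keys would not be matched either; that should not matter as long as subnets and security groups are always supplied together, but splitting the keys into separate statements would make the intent explicit.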