Unable to retrieve IAM credentials when launching sagemaker container #341

Closed
edgBR opened this issue Nov 27, 2020 · 10 comments
Labels
bug 🐞 Something isn't working

Comments

@edgBR

edgBR commented Nov 27, 2020

Dear colleagues,

Two months ago we finished our R MVP on SageMaker. We were able to run preprocessing, training, and inference, and to inject multiple parameters into the container using Step Functions.

We are now having the following issue in preprocessing when calling this function:

## SECRETS MANAGER ------

library(magrittr)  # provides %>%
library(jsonlite)  # provides fromJSON()

# Fetch a secret from AWS Secrets Manager and parse its JSON payload.
getSecrets <- function(region, secrets_name) {
  secrets_client <- paws::secretsmanager(config =
                                           list(
                                             region = region
                                           )
  )
  get_secret_value_response <- secrets_client$get_secret_value(
    SecretId=secrets_name
  )
  secrets_str <- get_secret_value_response$SecretString %>% fromJSON()
  return(secrets_str)
}


tryCatch(
  {
    print("Reading Secrets for Snowflake Credentials")
    # Report which environment the metadata checks detect.
    print(paste0("is this ec2? ",  aws.ec2metadata::is_ec2()))
    print(paste0("is this ecs? ", aws.ec2metadata::is_ecs()))
    secrets_name <- Sys.getenv("SECRET_NAME", unset = "mysecrets")
    aws_default_region <- Sys.getenv("AWS_REGION", unset = 'eu-central-1')
    secrets_dict <- getSecrets(region = aws_default_region, secrets_name = secrets_name)
  },
  error = function(e) {
    # Only prints the error message; secrets_dict stays undefined on failure.
    message(e)
  }
)

We are now getting the following error:

[1] "Reading Secrets for Snowflake Credentials"
[1] "is this ec2? FALSE"
[1] "is this ecs? TRUE"
No credentials providedNo credentials providedobject 'secrets_dict' not foundError in f() : No credentials provided
Calls: getSecrets ... sign_with_body -> get_credentials -> call_with_args -> f
Execution halted
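
The "object 'secrets_dict' not found" part of this output is probably a follow-on effect rather than a second problem: the error handler only prints the message, so the assignment inside tryCatch() never happens and later code that uses secrets_dict fails. A minimal sketch of a fail-fast variant, assuming a hypothetical helper named fetch_or_stop (not part of paws), could look like this:

```r
# Hypothetical helper (not part of paws): call `fetch` with the given
# arguments and stop with a single, contextual error if it fails.
fetch_or_stop <- function(fetch, ...) {
  tryCatch(
    fetch(...),
    error = function(e) {
      # Re-raise with context so the script halts at the real cause
      # instead of failing later on an undefined variable.
      stop("Could not read secrets: ", conditionMessage(e), call. = FALSE)
    }
  )
}
```

With it, the call above would become secrets_dict <- fetch_or_stop(getSecrets, region = aws_default_region, secrets_name = secrets_name), and the script stops at the first credentials error.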

This is a very strange issue for three reasons:

  1. When running this on an EC2 instance with the same role as the step function and the container, everything works perfectly.
  2. We have rolled back the container and the code, but we still have the same problem.
  3. We are also running some Python containers with the same step function that are able to retrieve the secrets (so it shouldn't be a permissions issue).

My renv lockfile entries for paws look as follows:

{
  "R": {
    "Version": "4.0.3",
    "Repositories": [
      {
        "Name": "CRAN",
        "URL": "https://cloud.r-project.org"
      }
    ]
  },
  "Packages": {
    "paws": {
      "Package": "paws",
      "Version": "0.1.9",
      "Source": "Repository",
      "Repository": "CRAN",
      "Hash": "cc1fd0294714f21c6d643c9dc29470ce"
    },
    "paws.analytics": {
      "Package": "paws.analytics",
      "Version": "0.1.9",
      "Source": "Repository",
      "Repository": "CRAN",
      "Hash": "28ba03b328d5a537054d527133b1e6c6"
    },
    "paws.application.integration": {
      "Package": "paws.application.integration",
      "Version": "0.1.9",
      "Source": "Repository",
      "Repository": "CRAN",
      "Hash": "7983719a1e9123872a6df51967ba6d98"
    },
    "paws.common": {
      "Package": "paws.common",
      "Version": "0.3.5",
      "Source": "Repository",
      "Repository": "CRAN",
      "Hash": "a4ea6004935b167e26fa0bbcba549cad"
    },
    "paws.compute": {
      "Package": "paws.compute",
      "Version": "0.1.9",
      "Source": "Repository",
      "Repository": "CRAN",
      "Hash": "048db92a80ade37db2a166181a22f5bb"
    },
    "paws.cost.management": {
      "Package": "paws.cost.management",
      "Version": "0.1.9",
      "Source": "Repository",
      "Repository": "CRAN",
      "Hash": "eb703944c3a369b1d67e2d9242684d13"
    },
    "paws.customer.engagement": {
      "Package": "paws.customer.engagement",
      "Version": "0.1.9",
      "Source": "Repository",
      "Repository": "CRAN",
      "Hash": "87caa37f6c03cdd92786098857b23580"
    },
    "paws.database": {
      "Package": "paws.database",
      "Version": "0.1.9",
      "Source": "Repository",
      "Repository": "CRAN",
      "Hash": "c6ba18123b5d6143d824a6132b809d26"
    },
    "paws.machine.learning": {
      "Package": "paws.machine.learning",
      "Version": "0.1.9",
      "Source": "Repository",
      "Repository": "CRAN",
      "Hash": "d2d20828dff45fe0b61aafdddc55ee49"
    },
    "paws.management": {
      "Package": "paws.management",
      "Version": "0.1.9",
      "Source": "Repository",
      "Repository": "CRAN",
      "Hash": "7f933bf67730c02432b56ce5e961fb46"
    },
    "paws.networking": {
      "Package": "paws.networking",
      "Version": "0.1.9",
      "Source": "Repository",
      "Repository": "CRAN",
      "Hash": "355569a323b12203ec8507c072c067e0"
    },
    "paws.security.identity": {
      "Package": "paws.security.identity",
      "Version": "0.1.9",
      "Source": "Repository",
      "Repository": "CRAN",
      "Hash": "7d7fc6f968c20d94195dd6d031e70af0"
    },
    "paws.storage": {
      "Package": "paws.storage",
      "Version": "0.1.9",
      "Source": "Repository",
      "Repository": "CRAN",
      "Hash": "23d613188f5d8f62b51ab455f83b51c3"
    }
  }
}

Any clue of why this is happening?

BR
/Edgar

@edgBR
Author

edgBR commented Nov 27, 2020

I have mapped the dependencies of the IAM retrieval to:

paws.common/R/client.R
paws.common/R/credentials.R
paws.common/R/credential_providers.R
paws.common/R/credential_sts.R
paws.common/R/net.R
paws.common/R/struct.R
paws.common/R/url.R

Update: continuing my investigation, I found that in

paws.common-v0.3.4...paws.common-v0.3.5

credential_providers received a substantial update in the last release. Maybe the issue lies there? It seems that I will have an interesting weekend :)

@davidkretch
Member

Sorry about that! We'll look in the next few days.

It sounds like there's a bug in getting ECS container credentials.

I don't think that the paws.common 0.3.5 update would have affected this, unless this worked before. The 0.3.5 update added a new way of getting credentials from a config file but I don't believe it should have affected this case.

davidkretch added the bug 🐞 Something isn't working label Nov 27, 2020
@abhilashc299

Yeah, I'm having the same issue. I can't seem to get the IAM role anymore.

@davidkretch
Member

Thanks. Are you also having the problem in containers running under SageMaker, or in some other context?

@abhilashc299

Yeah, I am using SageMaker Processing. I have two containers, one with code written in R using paws and the other in Python using boto3. There are no issues with Python, but I am looking for a workaround for R now.

@davidkretch
Member

davidkretch commented Nov 30, 2020

Thank you. If either of you has time, could you tell me whether:

  1. paws.common 0.3.4 works with IAM credentials? You can install it with devtools::install_version("paws.common", "0.3.4").
  2. getting IAM credentials works in other contexts, e.g. a SageMaker notebook?

@edgBR
Author

edgBR commented Nov 30, 2020

Hi @davidkretch

Getting the credentials when running this in an EC2 instance with the same IAM role attached works perfectly.

I have tested with 0.3.4 and it's not working, but let me post the logs tomorrow when I am back in the office.

I have made a partial fix by following these steps:

  1. Install the AWS CLI in my container.
  2. Build a wrapper around the AWS CLI Secrets Manager command that I need.
  3. Use system2() to run the AWS CLI commands.
  4. Assign the console output to a variable.
  5. Convert this to JSON and subset the secretsResponse.
  6. Use the values of the secrets in my script.
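
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names build_secret_args and get_secret_via_cli are hypothetical, the AWS CLI is assumed to be on the PATH, jsonlite is assumed to be installed, and the secret is assumed to hold a JSON object.

```r
# Hypothetical helper: build the argument vector for
# `aws secretsmanager get-secret-value`. system2() does not quote
# arguments itself, so user-supplied values are passed through shQuote().
build_secret_args <- function(secret_id, region) {
  c(
    "secretsmanager", "get-secret-value",
    "--secret-id", shQuote(secret_id),
    "--region", shQuote(region),
    "--query", "SecretString",  # return only the secret payload
    "--output", "text"
  )
}

# Hypothetical helper: run the CLI, capture stdout, and parse the payload.
# Assumes `aws` is on the PATH and the secret string is JSON.
get_secret_via_cli <- function(secret_id, region) {
  out <- system2("aws", args = build_secret_args(secret_id, region), stdout = TRUE)
  status <- attr(out, "status")  # only set when the exit code is non-zero
  if (!is.null(status) && status != 0) {
    stop("aws CLI exited with status ", status, call. = FALSE)
  }
  jsonlite::fromJSON(paste(out, collapse = "\n"))
}
```

A call such as get_secret_via_cli(secrets_name, aws_default_region) would then stand in for getSecrets() until the paws credential lookup works again.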

But of course this is far from optimal. I would like to keep my container dependencies to a minimum and not have to maintain any AWS CLI wrappers if I can use paws.

BR
/E

@davidkretch
Member

davidkretch commented Dec 2, 2020

Hello, thanks for the info. We've been working on this but so far have been unable to reproduce it. What we've tried so far is us-east-1 and eu-central-1, with Docker images based on Ubuntu 16 and R 4.0, with and without renv, all run through SageMaker Processing.

When you have time, could you provide us with

  1. the base image(s) you are using for your container (e.g. ubuntu:xx.xx)
  2. full list of R packages installed in the container image
  3. whether some/all other Paws operations fail similarly
  4. whether aws.ec2metadata::metadata$iam_info() returns anything (don't post the return value here)
  5. whether paws.common:::get_container_credentials() or paws.common:::get_iam_role() return anything (don't post return values)
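
As a complement to items 4 and 5, one way to gather related information without exposing any secret is to report only whether the usual credential-related environment variables are set. The sketch below is an illustration, not part of paws: the helper name credential_env_report is hypothetical, the variable names are the ones AWS SDKs commonly read for environment and container credentials, and whether SageMaker sets them inside a given container is an assumption to verify.

```r
# Hypothetical helper: report which credential-related environment
# variables are set, without ever printing their values.
credential_env_report <- function(vars = c(
                                    "AWS_ACCESS_KEY_ID",
                                    "AWS_SECRET_ACCESS_KEY",
                                    "AWS_SESSION_TOKEN",
                                    "AWS_PROFILE",
                                    "AWS_REGION",
                                    "AWS_DEFAULT_REGION",
                                    "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
                                    "AWS_CONTAINER_CREDENTIALS_FULL_URI"
                                  )) {
  values <- Sys.getenv(vars, unset = NA_character_)
  data.frame(
    variable = vars,
    # TRUE only when the variable exists and is non-empty.
    is_set = unname(!is.na(values) & nzchar(values)),
    stringsAsFactors = FALSE
  )
}

print(credential_env_report())
```

Only the set/unset flags are printed, so the output should be safe to paste into an issue.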

Thank you!

@davidkretch
Member

Hello, we're closing this for now. Please let us know if this is still an issue and we'll do our best!

@ncullen93

Unfortunately, I am having this issue on SageMaker inference, but only with serverless inference. I am deploying a model using the standard vetiver functions for doing so (ref: https://juliasilge.com/blog/vetiver-sagemaker/), along with some slight changes to the config to make it serverless. The vetiver deployment works perfectly with real-time inference, but when I change to serverless, it fails because paws can't find any credentials.

I wonder if there is anything special that should be done when building a Docker image for serverless inference, or if this is just a paws issue.

4 participants