Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set default IMDS timeouts to match AWS SDK #10133

Merged
merged 7 commits into from
Oct 16, 2020
Merged

Conversation

tvoran
Copy link
Member

@tvoran tvoran commented Oct 12, 2020

Sets timeout and retries on the http client used for connecting to the local instance metadata service (IMDS) to match those in the AWS go SDK: aws/aws-sdk-go#3066

The main motivation for this patch is the slow startup when using auto-unseal on AWS since 1.4, ex:

This essentially shortens the time it takes for the aws sdk to fall back to v1 of the IMDS API, when vault is running in docker on an EC2 instance, and the hop limit on the underlying EC2 instance is set to 1:

Vault would be using those timeout and retry settings already, but the default http client is being overridden in GenerateCredentialChain():

// Add the remote provider
def := defaults.Get()
if c.Region != "" {
def.Config.Region = aws.String(c.Region)
}
if c.HTTPClient != nil {
def.Config.HTTPClient = c.HTTPClient
}
providers = append(providers, defaults.RemoteCredProvider(*def.Config, def.Handlers))

Which then causes this block in the SDK to be skipped:
https://github.com/aws/aws-sdk-go/blob/v1.30.27/aws/ec2metadata/service.go#L75-L87

func NewClient(cfg aws.Config, handlers request.Handlers, endpoint, signingRegion string, opts ...func(*client.Client)) *EC2Metadata {
	if !aws.BoolValue(cfg.EC2MetadataDisableTimeoutOverride) && httpClientZero(cfg.HTTPClient) {
		// If the http client is unmodified and this feature is not disabled
		// set custom timeouts for EC2Metadata requests.
		cfg.HTTPClient = &http.Client{
			// use a shorter timeout than default because the metadata
			// service is local if it is running, and to fail faster
			// if not running on an ec2 instance.
			Timeout: 1 * time.Second,
		}
		// max number of retries on the client operation
		cfg.MaxRetries = aws.Int(2)
	}

Note that this patch skips setting the defaults if either AWS_CONTAINER_CREDENTIALS_FULL_URI or AWS_CONTAINER_CREDENTIALS_RELATIVE_URI is present in the env, because they trigger a different codepath than NewClient(). Those environment variables are checked in the SDK in RemoteCredProvider():
https://github.com/aws/aws-sdk-go/blob/v1.30.27/aws/defaults/defaults.go#L120-L133

@tvoran tvoran requested a review from a team October 12, 2020 21:50
@calvn calvn added this to the 1.6 milestone Oct 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants