
New Pattern: aws-lambda-sagemakerendpoint #111

Closed
tabdunabi opened this issue Dec 16, 2020 · 0 comments · Fixed by #112
Assignees
Labels
feature-request (A feature should be added or improved) · in-progress (This issue is being actively worked on)

Comments

@tabdunabi
Contributor

The serverless architecture (API Gateway – Lambda – SageMaker endpoint) is a common pattern
for deploying real-time ML inference microservices in production. Most ML models require data pre-processing and feature engineering that cannot be easily implemented using API Gateway's request mapping templates. In some cases, raw data also needs to be enriched using reference datasets. This is where the proposed lambda-sagemakerendpoint construct is a better choice than the existing aws-apigateway-sagemakerendpoint construct: the Lambda function provides the compute needed to perform any required data pre-processing, feature engineering, and/or data enrichment. The proposed lambda-sagemakerendpoint can be combined with the existing apigateway-lambda construct to create apigateway-lambda-sagemakerendpoint.
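To make the pre-processing role of the Lambda function concrete, here is an illustrative Python sketch of a handler sitting between API Gateway and a SageMaker endpoint. The endpoint-name environment variable, the feature names, and the CSV payload layout are assumptions for the example, not part of the proposed construct:

```python
import json
import os


def preprocess(payload: dict) -> str:
    """Toy feature-engineering step: select and order raw fields into the
    CSV layout a (hypothetical) model expects. Real pipelines would do
    scaling, encoding, or enrichment from reference datasets here."""
    features = [payload["age"], payload["income"], payload["score"]]
    return ",".join(str(f) for f in features)


def handler(event, context):
    """Lambda handler: pre-process the API Gateway request body, then
    forward the transformed payload to a SageMaker endpoint."""
    import boto3  # imported lazily so preprocess() is testable offline

    body = json.loads(event["body"])
    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=os.environ["SAGEMAKER_ENDPOINT_NAME"],  # assumed env var
        ContentType="text/csv",
        Body=preprocess(body),
    )
    prediction = response["Body"].read().decode("utf-8")
    return {"statusCode": 200, "body": json.dumps({"prediction": prediction})}
```

The handler itself requires AWS credentials and a deployed endpoint to run; only the pure pre-processing step is exercised offline.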

Use cases

  • Real-time synchronous inference: the proposed lambda-sagemakerendpoint, combined with aws-apigateway, can be used for real-time synchronous inference, where requests from client applications are forwarded to the SageMaker endpoint, and the predictions produced by the endpoint's model are returned in the responses to the client application.
  • Real-time asynchronous inference: the proposed lambda-sagemakerendpoint pattern can be used for real-time asynchronous inference by integrating it with a messaging service such as Amazon SQS. This pattern is suitable for high-throughput applications, where API Gateway limits (such as requests per second or payload size) or heavy data pre-processing would create a bottleneck in the ML inference pipeline.
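As an illustration of the asynchronous variant, a Lambda function triggered by an SQS queue receives a batch of records and forwards each one to the endpoint. The message fields used here (`endpoint`, `payload`) are hypothetical; only the standard SQS event shape (`Records[].body`) is assumed:

```python
import json


def extract_inference_requests(sqs_event: dict) -> list:
    """Pull inference payloads out of an SQS-triggered Lambda event.
    Each SQS record body is assumed to be a JSON inference request."""
    return [json.loads(record["body"]) for record in sqs_event["Records"]]


def handler(event, context):
    """Forward each queued request to a SageMaker endpoint."""
    import boto3  # lazy import; the handler needs AWS credentials to run

    runtime = boto3.client("sagemaker-runtime")
    for request in extract_inference_requests(event):
        runtime.invoke_endpoint(
            EndpointName=request["endpoint"],  # hypothetical message field
            ContentType="application/json",
            Body=json.dumps(request["payload"]),
        )
```

Decoupling through a queue lets the pipeline absorb bursts beyond API Gateway's throughput and payload limits, at the cost of delivering predictions out-of-band (e.g., to another queue or datastore).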

Proposed solution
The proposed pattern provides the following:

  • Support for using an existing SageMaker inference endpoint or creating a new one.
  • Creation of the required role permissions (e.g., permission for the Lambda function to invoke the SageMaker endpoint, write logs, and send X-Ray traces, plus a SageMaker service role for creating SageMaker resources).
  • VPC configuration (i.e., creating a new VPC or using an existing one) for the SageMaker endpoint and the Lambda function.
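For the Lambda-to-SageMaker permission specifically, the core grant is `sagemaker:InvokeEndpoint`; a minimal policy statement might look like the following sketch (the wildcard resource ARN is a placeholder and should be scoped to the actual endpoint ARN in practice):

```python
# Minimal IAM policy document allowing a Lambda execution role to call
# the SageMaker runtime. The endpoint ARN below is a placeholder.
invoke_endpoint_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sagemaker:InvokeEndpoint",
            "Resource": "arn:aws:sagemaker:*:*:endpoint/*",
        }
    ],
}
```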
@hnishar hnishar changed the title New Pattern: lambda-sagemakerendpoint New Pattern: aws-lambda-sagemakerendpoint Dec 16, 2020
@hnishar hnishar added feature-request A feature should be added or improved in-progress This issue is being actively worked on labels Dec 16, 2020