sagemaker: Support multi-model endpoints #23154

Open
1 of 2 tasks
petermeansrock opened this issue Nov 29, 2022 · 0 comments
Labels
@aws-cdk/aws-sagemaker Related to AWS SageMaker effort/medium Medium work item – several days of effort feature-request A feature should be added or improved. p3

Comments

@petermeansrock
Contributor

petermeansrock commented Nov 29, 2022

Describe the feature

As described in the SageMaker Endpoint L2 construct RFC:

Multi-Model Endpoints: By default (and as described in the technical solution above), SageMaker expects the model data URL on each container to point to an S3 object containing a gzipped tar file of artifacts, which will be automatically extracted upon instance provisioning. To support colocation of multiple logical models into a single container, the Mode attribute was added to the ContainerDefinition CloudFormation structure to explicitly configure either SingleModel mode (the default) or MultiModel mode. In multi-model mode, SageMaker instead expects the customer-configured model data URL to point to an S3 path under which multiple gzipped tar files exist. When invoking a multi-model endpoint, the client must specify the target model, which is the exact S3 path suffix pointing to a specific gzipped tar file.
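To make the difference between the two modes concrete, here is a minimal sketch in plain TypeScript. It is not CDK code: the `ContainerDefinition` interface only mirrors the `Image`, `ModelDataUrl`, and `Mode` properties of the CloudFormation structure mentioned above, and `resolveTargetModel`, the bucket name, and the image URI are hypothetical names made up for illustration.

```typescript
// Hypothetical, simplified mirror of the CloudFormation ContainerDefinition
// properties discussed above. Not the CDK or SDK type.
type ContainerMode = 'SingleModel' | 'MultiModel';

interface ContainerDefinition {
  Image: string;
  ModelDataUrl: string;
  Mode: ContainerMode;
}

// Single-model mode (the default): the URL points at one gzipped tar object.
const singleModel: ContainerDefinition = {
  Image: '123456789012.dkr.ecr.us-east-1.amazonaws.com/example:latest', // placeholder
  ModelDataUrl: 's3://example-bucket/models/model-a.tar.gz',
  Mode: 'SingleModel',
};

// Multi-model mode: the URL points at an S3 path holding several .tar.gz files.
const multiModel: ContainerDefinition = {
  Image: '123456789012.dkr.ecr.us-east-1.amazonaws.com/example:latest', // placeholder
  ModelDataUrl: 's3://example-bucket/models/',
  Mode: 'MultiModel',
};

// Illustrative helper (an assumption, not a SageMaker API): shows which S3
// object a given target model suffix refers to under the configured path.
function resolveTargetModel(container: ContainerDefinition, targetModel: string): string {
  if (container.Mode !== 'MultiModel') {
    throw new Error('A target model only applies to containers in MultiModel mode');
  }
  const prefix = container.ModelDataUrl.replace(/\/+$/, ''); // drop trailing slashes
  const suffix = targetModel.replace(/^\/+/, '');            // drop leading slashes
  return prefix + '/' + suffix;
}
```

For example, `resolveTargetModel(multiModel, 'model-b.tar.gz')` yields `s3://example-bucket/models/model-b.tar.gz`, while passing `singleModel` throws because no target model is involved in single-model mode.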

Please 👍 this issue to help with the prioritization of this feature.

Use Case

Colocating multiple models behind a single endpoint can help customers save money as described here.

Proposed Solution

As described in the SageMaker Endpoint L2 construct RFC:

To accommodate this feature, the proposed ModelData.fromAsset API should be adjusted to support zip file assets capable of containing one or more gzipped tar files within them. Even though the code need not be aware of .tar.gz files specifically, it might prove a better customer experience to at least put up guard rails that prevent zip file assets from being used in single-model mode, whereas multi-model mode could be more permissive.
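One way the guard rail described above could look is sketched below. This is only an assumption about a possible shape: `validateModelAsset` and `ModelMode` are hypothetical names, the check relies purely on the file extension, and nothing here reflects how `ModelData.fromAsset` is actually implemented.

```typescript
// Hypothetical mode type for this sketch; it mirrors the two Mode values above.
type ModelMode = 'SingleModel' | 'MultiModel';

// Illustrative guard rail (not part of the CDK): reject zip assets in
// single-model mode and stay permissive in multi-model mode.
function validateModelAsset(assetPath: string, mode: ModelMode): void {
  const isZip = /\.zip$/i.test(assetPath);
  if (mode === 'SingleModel' && isZip) {
    throw new Error(
      'Zip assets are not supported in SingleModel mode (got "' + assetPath + '"); ' +
      'provide a gzipped tar file or use MultiModel mode.'
    );
  }
  // MultiModel mode: no restriction is applied in this sketch, since a zip
  // asset may contain one or more gzipped tar files.
}
```

With this sketch, `validateModelAsset('models.zip', 'SingleModel')` throws, while `validateModelAsset('model.tar.gz', 'SingleModel')` and `validateModelAsset('models.zip', 'MultiModel')` both pass. Failing early at synthesis time would surface the misconfiguration before SageMaker attempts to extract the asset.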

Other Information

No response

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

CDK version used

2.54.0-alpha.0

Environment details (OS name and version, etc.)

macOS Ventura

@petermeansrock petermeansrock added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels Nov 29, 2022
@github-actions github-actions bot added the @aws-cdk/aws-sagemaker Related to AWS SageMaker label Nov 29, 2022
@peterwoodworth peterwoodworth added p2 effort/medium Medium work item – several days of effort and removed needs-triage This issue or PR still needs to be triaged. labels Nov 29, 2022
@madeline-k madeline-k removed their assignment Oct 30, 2023
@pahud pahud added p3 and removed p2 labels Jun 11, 2024
Projects
None yet
Development

No branches or pull requests

4 participants