feat(sagemaker): add model hosting L2 constructs #20113

Closed · wants to merge 65 commits

Changes from all commits (65 commits)
c2b8d37
feat(sagemaker): add model hosting L2 constructs
petermeansrock Feb 3, 2020
a68ba16
add README changes missing from earlier commit
petermeansrock Feb 4, 2020
8a9494e
Merge branch 'master' into sagemaker-l2
petermeansrock Feb 24, 2020
9edc79d
Merge branch 'master' into sagemaker-l2
petermeansrock Feb 24, 2020
9c94dc9
Merge branch 'master' into sagemaker-l2
petermeansrock Feb 25, 2020
95b4d5a
Merge branch 'master' into sagemaker-l2
petermeansrock Feb 28, 2020
70a0b73
Merge branch 'master' into sagemaker-l2
petermeansrock Mar 3, 2020
d52be3c
Merge branch 'master' into sagemaker-l2
petermeansrock Mar 3, 2020
9afb024
Merge branch 'master' into sagemaker-l2
petermeansrock Mar 4, 2020
344894f
Merge branch 'master' into sagemaker-l2
petermeansrock Mar 4, 2020
a5236c0
Merge branch 'master' into sagemaker-l2
petermeansrock Mar 7, 2020
4aa4fab
Merge branch 'master' into sagemaker-l2
petermeansrock Mar 9, 2020
b3678d2
Merge branch 'master' into sagemaker-l2
petermeansrock Mar 11, 2020
d16127e
Merge branch 'master' into sagemaker-l2
petermeansrock Mar 11, 2020
83a3ac2
Merge branch 'master' into sagemaker-l2
petermeansrock Mar 11, 2020
fc25fdb
Merge branch 'master' into sagemaker-l2
petermeansrock Mar 11, 2020
f30c975
Merge branch 'master' into sagemaker-l2
petermeansrock Mar 12, 2020
9e5575d
Merge branch 'master' into sagemaker-l2
petermeansrock Apr 26, 2022
8351909
Adopt Jest naming convention
petermeansrock Apr 26, 2022
50fc173
Migrate from Nodeunit/assert to Jest/assertions
petermeansrock Apr 27, 2022
143aec5
Support integ test run on non-linux/amd64 machines
petermeansrock Apr 27, 2022
519415c
Migrate off deprecated APIs
petermeansrock Apr 27, 2022
531d06a
Update code to meet linting requirements
petermeansrock Apr 27, 2022
bdb3be0
Replace cdk-integ snapshots with integ-runner ones
petermeansrock Apr 28, 2022
8fdb2fd
Merge branch 'master' into sagemaker-l2
petermeansrock Apr 28, 2022
43afc42
Add Rosetta fixtures to fix README build issues
petermeansrock Apr 28, 2022
abfdc66
Merge branch 'master' into sagemaker-l2
petermeansrock May 11, 2022
ad51f1b
Remove extra newline from README
petermeansrock Jul 21, 2022
18ef532
Incorporate README file changes from RFC review
petermeansrock Jul 21, 2022
b979d23
Fix specification of defaults per RFC feedback
petermeansrock Jul 22, 2022
fe232e7
Simplify autoscaling documentation per RFC feedback
petermeansrock Aug 16, 2022
1dc57f8
Remove mention of endpoint from model VPC docs
petermeansrock Aug 16, 2022
bd77937
Eliminate uppercase abbreviations per RFC feedback
petermeansrock Aug 16, 2022
6c2b696
Simplify container/variant props per RFC feedback
petermeansrock Aug 16, 2022
c4dad9f
Bring container limit up-to-date with reality
petermeansrock Aug 17, 2022
2145924
Support lazy specification of model containers
petermeansrock Aug 16, 2022
e549cbf
Remove non-default Rosetta fixtures
petermeansrock Aug 25, 2022
93a8c85
Document EndpointConfig reuse across Endpoints
petermeansrock Sep 6, 2022
1df7d85
Drop scope/id specification in ContainerImage API
petermeansrock Sep 13, 2022
8612092
Drop scope/id specification in ModelData API
petermeansrock Sep 13, 2022
ed87db4
Distinguish instance-based variants
petermeansrock Sep 13, 2022
5e903fa
List supported instance types with InstanceType
petermeansrock Sep 14, 2022
1d681da
Distinguish instance-based variants for Endpoints
petermeansrock Sep 15, 2022
88a8708
Drop README mention of IEndpointProductionVariant
petermeansrock Sep 15, 2022
0ade09b
Use enum-like class for AcceleratorType
petermeansrock Sep 15, 2022
c7d27c9
Add return type to accelerator/instance type APIs
petermeansrock Sep 15, 2022
1f21d15
Fix missing mention of "instance" in docs
petermeansrock Sep 15, 2022
f670653
Merge branch 'master' into sagemaker-l2
petermeansrock Sep 20, 2022
3c36b11
Replace validate() hooks with node.addValidation()
petermeansrock Sep 20, 2022
049eec4
Fix burstable check after EC2 module refactor
petermeansrock Sep 20, 2022
867e9f5
Flip zip check as behavior changed on gzipped tars
petermeansrock Sep 20, 2022
28f0409
Specify ECR image tag as CDK no longer uses latest
petermeansrock Sep 21, 2022
371f756
Adopt IntegTest & regenerate snapshots
petermeansrock Sep 20, 2022
f8cd7db
Merge branch 'master' into sagemaker-l2
petermeansrock Sep 21, 2022
a7faff8
Verify endpoint behavior in integ test w/ API call
petermeansrock Sep 22, 2022
ebd2b0f
Merge branch 'master' into sagemaker-l2
petermeansrock Sep 22, 2022
588100a
Validate presence of at least one variant
petermeansrock Sep 22, 2022
9b1766d
Merge branch 'master' into sagemaker-l2
petermeansrock Sep 22, 2022
8640604
Merge branch 'master' into sagemaker-l2
petermeansrock Oct 3, 2022
4573d5b
Create assets behind ContainerImage.fromAsset API
petermeansrock Oct 3, 2022
381cb6f
Create assets behind ModelData.fromAsset API
petermeansrock Oct 3, 2022
74a8f3c
Trim README content in favor of links to AWS docs
petermeansrock Oct 4, 2022
406875f
Merge branch 'master' into sagemaker-l2
petermeansrock Oct 6, 2022
0f22439
Adjust EndpointConfig tests to emphasize synthesis
petermeansrock Oct 6, 2022
e0741ff
Adjust EndpointProps to take IEndpointConfig
petermeansrock Oct 6, 2022
239 changes: 227 additions & 12 deletions packages/@aws-cdk/aws-sagemaker/README.md
@@ -9,31 +9,246 @@
>
> [CFN Resources]: https://docs.aws.amazon.com/cdk/latest/guide/constructs.html#constructs_lib

![cdk-constructs: Experimental](https://img.shields.io/badge/cdk--constructs-experimental-important.svg?style=for-the-badge)

> The APIs of higher level constructs in this module are experimental and under active development.
> They are subject to non-backward compatible changes or removal in any future version. These are
> not subject to the [Semantic Versioning](https://semver.org/) model and breaking changes will be
> announced in the release notes. This means that while you may use them, you may need to update
> your source code when upgrading to a newer version of this package.

---

<!--END STABILITY BANNER-->

Amazon SageMaker provides every developer and data scientist with the ability to build, train, and
deploy machine learning models quickly. Amazon SageMaker is a fully-managed service that covers the
entire machine learning workflow to label and prepare your data, choose an algorithm, train the
model, tune and optimize it for deployment, make predictions, and take action. Your models get to
production faster with much less effort and lower cost.

## Installation

Install the module:

```console
$ npm i @aws-cdk/aws-sagemaker
```

Import it into your code:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';
```
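
All of the examples that follow use `this` as the construct scope. A minimal sketch of such a
scope, assuming a standard CDK app with a single stack (the class name and layout here are
illustrative only):

```typescript
import * as cdk from '@aws-cdk/core';

// The SageMaker constructs shown in the rest of this README (sagemaker.Model,
// sagemaker.EndpointConfig, sagemaker.Endpoint, ...) are instantiated inside a
// construct scope such as this stack, which is what `this` refers to.
class SageMakerStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);
    // ... define models, endpoint configurations, and endpoints here ...
  }
}
```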

## Model

To create a machine learning model with Amazon SageMaker, use the `Model` construct. This construct
includes properties that can be configured to define model components, including the model inference
code as a Docker image and an optional set of separate model data artifacts. See the [AWS
documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-marketplace-develop.html)
to learn more about SageMaker models.

### Single Container Model

In the event that a single container is sufficient for your inference use-case, you can define a
single-container model:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';
import * as path from 'path';

const image = sagemaker.ContainerImage.fromAsset(path.join('path', 'to', 'Dockerfile', 'directory'));
const modelData = sagemaker.ModelData.fromAsset(path.join('path', 'to', 'artifact', 'file.tar.gz'));

const model = new sagemaker.Model(this, 'PrimaryContainerModel', {
  containers: [
    {
      image: image,
      modelData: modelData,
    }
  ]
});
```

### Inference Pipeline Model

An inference pipeline is an Amazon SageMaker model that is composed of a linear sequence of multiple
containers that process requests for inferences on data. See the [AWS
documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/inference-pipelines.html) to learn
more about SageMaker inference pipelines. To define an inference pipeline, you can provide
additional containers for your model:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';

declare const image1: sagemaker.ContainerImage;
declare const modelData1: sagemaker.ModelData;
declare const image2: sagemaker.ContainerImage;
declare const modelData2: sagemaker.ModelData;
declare const image3: sagemaker.ContainerImage;
declare const modelData3: sagemaker.ModelData;

const model = new sagemaker.Model(this, 'InferencePipelineModel', {
  containers: [
    { image: image1, modelData: modelData1 },
    { image: image2, modelData: modelData2 },
    { image: image3, modelData: modelData3 }
  ],
});
```

### Container Images

Inference code can be stored in Amazon Elastic Container Registry (Amazon ECR) and is specified
via the `image` property of `ContainerDefinition`, which accepts a class that extends the
`ContainerImage` abstract base class.

#### Asset Image

Reference a local directory containing a Dockerfile:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';
import * as path from 'path';

const image = sagemaker.ContainerImage.fromAsset(path.join('path', 'to', 'Dockerfile', 'directory'));
```

#### ECR Image

Reference an image available within ECR:

```typescript
import * as ecr from '@aws-cdk/aws-ecr';
import * as sagemaker from '@aws-cdk/aws-sagemaker';

const repository = ecr.Repository.fromRepositoryName(this, 'Repository', 'repo');
const image = sagemaker.ContainerImage.fromEcrRepository(repository, 'tag');
```

### Model Artifacts

If you choose to decouple your model artifacts from your inference code (natural, given that
inference code and model artifacts typically change at different rates), the artifacts can be
specified via the `modelData` property, which accepts a class that extends the `ModelData` abstract
base class. By default, a model has no model artifacts associated with it.

#### Asset Model Data

Reference local model data:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';
import * as path from 'path';

const modelData = sagemaker.ModelData.fromAsset(path.join('path', 'to', 'artifact', 'file.tar.gz'));
```

#### S3 Model Data

Reference an S3 bucket and object key as the artifacts for a model:

```typescript
import * as s3 from '@aws-cdk/aws-s3';
import * as sagemaker from '@aws-cdk/aws-sagemaker';

const bucket = new s3.Bucket(this, 'MyBucket');
const modelData = sagemaker.ModelData.fromBucket(bucket, 'path/to/artifact/file.tar.gz');
```

## Model Hosting

Amazon SageMaker provides model hosting services for model deployment: it exposes an HTTPS
endpoint where your machine learning model is available to provide inferences.

### Endpoint Configuration

Using the `EndpointConfig` construct, you can define an endpoint configuration that can be used to
provision one or more endpoints. In this configuration, you identify the models to deploy and the
resources that you want Amazon SageMaker to provision for them: you define one or more production
variants, each of which identifies a model and describes the resources that Amazon SageMaker should
provision for it. If you are hosting multiple models, you also assign each variant a weight to
specify how much traffic to allocate to its model. For example, suppose that you want to host two
models, A and B, and you assign traffic weight 2 to model A and 1 to model B. Amazon SageMaker then
distributes two-thirds of the traffic to model A and one-third to model B:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';

declare const modelA: sagemaker.Model;
declare const modelB: sagemaker.Model;

const endpointConfig = new sagemaker.EndpointConfig(this, 'EndpointConfig', {
  instanceProductionVariants: [
    {
      model: modelA,
      variantName: 'modelA',
      initialVariantWeight: 2.0,
    },
    {
      model: modelB,
      variantName: 'variantB',
      initialVariantWeight: 1.0,
    },
  ]
});
```
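
The resulting traffic split is each variant's weight divided by the sum of all weights. The
snippet below is a plain TypeScript illustration of that arithmetic using the weights from the
example above; it is not part of the construct API:

```typescript
// Traffic share per variant = variant weight / sum of all variant weights.
const weights: Record<string, number> = { modelA: 2.0, variantB: 1.0 };
const total = Object.values(weights).reduce((sum, weight) => sum + weight, 0);
const shares = Object.fromEntries(
  Object.entries(weights).map(([name, weight]) => [name, weight / total]),
);
// shares => { modelA: 0.666..., variantB: 0.333... }
```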

### Endpoint

When you create an endpoint from an `EndpointConfig`, Amazon SageMaker launches the ML compute
instances and deploys the model or models as specified in the configuration. To get inferences from
the model, client applications send requests to the Amazon SageMaker Runtime HTTPS endpoint. For
more information, see the
[InvokeEndpoint](https://docs.aws.amazon.com/sagemaker/latest/dg/API_runtime_InvokeEndpoint.html)
API documentation. Defining an endpoint requires at minimum the associated endpoint configuration:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';

declare const endpointConfig: sagemaker.EndpointConfig;

const endpoint = new sagemaker.Endpoint(this, 'Endpoint', { endpointConfig });
```
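
For illustration, a client application might then call the endpoint through the AWS SDK for
JavaScript v3, as sketched below. The endpoint name is assumed to be available (for example from
the endpoint construct's `endpointName` attribute), and the content type and payload shape are
container-specific assumptions:

```typescript
import { SageMakerRuntimeClient, InvokeEndpointCommand } from '@aws-sdk/client-sagemaker-runtime';

async function invokeEndpoint(endpointName: string): Promise<string> {
  const client = new SageMakerRuntimeClient({});
  const response = await client.send(new InvokeEndpointCommand({
    EndpointName: endpointName,
    ContentType: 'application/json',                        // container-specific assumption
    Body: JSON.stringify({ instances: [[1.0, 2.0, 3.0]] }), // payload shape depends on the container
  }));
  // The response body is returned as bytes; decode it before use.
  return new TextDecoder().decode(response.Body);
}
```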

### AutoScaling

To enable autoscaling on the production variant, use the `autoScaleInstanceCount` method:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';

declare const endpointConfig: sagemaker.EndpointConfig;

const endpoint = new sagemaker.Endpoint(this, 'Endpoint', { endpointConfig });
const productionVariant = endpoint.findInstanceProductionVariant('variantName');
const instanceCount = productionVariant.autoScaleInstanceCount({
  maxCapacity: 3
});
instanceCount.scaleOnInvocations('LimitRPS', {
  maxRequestsPerSecond: 30,
});
```

For load-testing guidance on determining the maximum requests per second per instance, see the
[AWS documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/endpoint-scaling-loadtest.html).
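
As a rough sketch of that guidance, the `maxRequestsPerSecond` value above can be derived from the
peak per-instance throughput observed during a load test, reduced by a safety factor (the linked
documentation suggests a factor such as 0.5; the peak value below is a made-up example):

```typescript
// Hypothetical load-test result: the highest request rate a single instance sustained
// before latency or error rates became unacceptable.
const peakRequestsPerSecondPerInstance = 60;

// Leave headroom below the observed peak; adjust the factor to your own tolerance.
const safetyFactor = 0.5;

// Value to pass as maxRequestsPerSecond in scaleOnInvocations().
const maxRequestsPerSecond = Math.floor(peakRequestsPerSecondPerInstance * safetyFactor); // 30
```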

### Metrics

To monitor CloudWatch metrics for a production variant, use one or more of the metric convenience
methods:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';

declare const endpointConfig: sagemaker.EndpointConfig;

const endpoint = new sagemaker.Endpoint(this, 'Endpoint', { endpointConfig });
const productionVariant = endpoint.findInstanceProductionVariant('variantName');
productionVariant.metricModelLatency().createAlarm(this, 'ModelLatencyAlarm', {
  threshold: 100000,
  evaluationPeriods: 3,
});
```
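
The same metrics can also be charted on a CloudWatch dashboard. The sketch below assumes the
variant exposes a `metricInvocations()` convenience method alongside `metricModelLatency()` (the
method name is an assumption; SageMaker reports `ModelLatency` in microseconds):

```typescript
import * as cloudwatch from '@aws-cdk/aws-cloudwatch';
import * as sagemaker from '@aws-cdk/aws-sagemaker';

declare const endpoint: sagemaker.Endpoint;

const productionVariant = endpoint.findInstanceProductionVariant('variantName');
const dashboard = new cloudwatch.Dashboard(this, 'EndpointDashboard');
dashboard.addWidgets(
  new cloudwatch.GraphWidget({
    title: 'Invocations',
    left: [productionVariant.metricInvocations()], // assumed convenience method
  }),
  new cloudwatch.GraphWidget({
    title: 'ModelLatency (microseconds)',
    left: [productionVariant.metricModelLatency()],
  }),
);
```
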
64 changes: 64 additions & 0 deletions packages/@aws-cdk/aws-sagemaker/lib/accelerator-type.ts
@@ -0,0 +1,64 @@
import * as cdk from '@aws-cdk/core';

/**
 * Supported Elastic Inference (EI) instance types for SageMaker instance-based production variants.
 * EI instances provide on-demand GPU computing for inference.
 */
export class AcceleratorType {
  /**
   * ml.eia1.large
   */
  public static readonly EIA1_LARGE = AcceleratorType.of('ml.eia1.large');

  /**
   * ml.eia1.medium
   */
  public static readonly EIA1_MEDIUM = AcceleratorType.of('ml.eia1.medium');

  /**
   * ml.eia1.xlarge
   */
  public static readonly EIA1_XLARGE = AcceleratorType.of('ml.eia1.xlarge');

  /**
   * ml.eia2.large
   */
  public static readonly EIA2_LARGE = AcceleratorType.of('ml.eia2.large');

  /**
   * ml.eia2.medium
   */
  public static readonly EIA2_MEDIUM = AcceleratorType.of('ml.eia2.medium');

  /**
   * ml.eia2.xlarge
   */
  public static readonly EIA2_XLARGE = AcceleratorType.of('ml.eia2.xlarge');

  /**
   * Builds an AcceleratorType from a given string or token (such as a CfnParameter).
   * @param acceleratorType An accelerator type as string
   * @returns A strongly typed AcceleratorType
   */
  public static of(acceleratorType: string): AcceleratorType {
    return new AcceleratorType(acceleratorType);
  }

  private readonly acceleratorTypeIdentifier: string;

  constructor(acceleratorType: string) {
    if (cdk.Token.isUnresolved(acceleratorType) || acceleratorType.startsWith('ml.')) {
      this.acceleratorTypeIdentifier = acceleratorType;
    } else {
      throw new Error(`accelerator type must start with 'ml.'; (got ${acceleratorType})`);
    }
  }

  /**
   * Return the accelerator type as a string
   * @returns The accelerator type as a string
   */
  public toString(): string {
    return this.acceleratorTypeIdentifier;
  }
}
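
As a usage sketch, an accelerator type can be taken from the predefined constants or built with
`of()`, and attached to an instance-based production variant. As in the README examples above,
`this` is a construct scope; the `acceleratorType` property name below is assumed from the rest of
this pull request rather than shown above:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';

declare const model: sagemaker.Model;

// A predefined constant and an arbitrary 'ml.'-prefixed identifier built with of()
// are equivalent ways to name an Elastic Inference accelerator.
const acceleratorType = sagemaker.AcceleratorType.EIA2_MEDIUM;
// const acceleratorType = sagemaker.AcceleratorType.of('ml.eia2.medium');

const endpointConfig = new sagemaker.EndpointConfig(this, 'EndpointConfig', {
  instanceProductionVariants: [{
    model,
    variantName: 'accelerated',
    acceleratorType, // assumed property for attaching the accelerator
  }],
});
```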