feat(sagemaker): add Model L2 construct #22549

Merged
merged 15 commits into from
Nov 9, 2022
Changes from 14 commits
143 changes: 131 additions & 12 deletions packages/@aws-cdk/aws-sagemaker/README.md
Original file line number Diff line number Diff line change
@@ -9,31 +9,150 @@
>
> [CFN Resources]: https://docs.aws.amazon.com/cdk/latest/guide/constructs.html#constructs_lib

![cdk-constructs: Experimental](https://img.shields.io/badge/cdk--constructs-experimental-important.svg?style=for-the-badge)

> The APIs of higher level constructs in this module are experimental and under active development.
> They are subject to non-backward compatible changes or removal in any future version. These are
> not subject to the [Semantic Versioning](https://semver.org/) model and breaking changes will be
> announced in the release notes. This means that while you may use them, you may need to update
> your source code when upgrading to a newer version of this package.

---

<!--END STABILITY BANNER-->

This module is part of the [AWS Cloud Development Kit](https://github.com/aws/aws-cdk) project.
Amazon SageMaker provides every developer and data scientist with the ability to build, train, and
deploy machine learning models quickly. Amazon SageMaker is a fully-managed service that covers the
entire machine learning workflow to label and prepare your data, choose an algorithm, train the
model, tune and optimize it for deployment, make predictions, and take action. Your models get to
production faster with much less effort and lower cost.

## Installation

Install the module:

```console
$ npm i @aws-cdk/aws-sagemaker
```

Import it into your code:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';
```

## Model

To create a machine learning model with Amazon SageMaker, use the `Model` construct. This construct
includes properties that can be configured to define model components, including the model inference
code as a Docker image and an optional set of separate model data artifacts. See the [AWS
documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-marketplace-develop.html)
to learn more about SageMaker models.

### Single Container Model

If a single container is sufficient for your inference use case, you can define a
single-container model:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';
import * as path from 'path';

const image = sagemaker.ContainerImage.fromAsset(path.join('path', 'to', 'Dockerfile', 'directory'));
const modelData = sagemaker.ModelData.fromAsset(path.join('path', 'to', 'artifact', 'file.tar.gz'));

const model = new sagemaker.Model(this, 'PrimaryContainerModel', {
  containers: [
    {
      image: image,
      modelData: modelData,
    }
  ]
});
```

### Inference Pipeline Model

An inference pipeline is an Amazon SageMaker model that is composed of a linear sequence of multiple
containers that process requests for inferences on data. See the [AWS
documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/inference-pipelines.html) to learn
more about SageMaker inference pipelines. To define an inference pipeline, you can provide
additional containers for your model:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';

declare const image1: sagemaker.ContainerImage;
declare const modelData1: sagemaker.ModelData;
declare const image2: sagemaker.ContainerImage;
declare const modelData2: sagemaker.ModelData;
declare const image3: sagemaker.ContainerImage;
declare const modelData3: sagemaker.ModelData;

const model = new sagemaker.Model(this, 'InferencePipelineModel', {
  containers: [
    { image: image1, modelData: modelData1 },
    { image: image2, modelData: modelData2 },
    { image: image3, modelData: modelData3 }
  ],
});
```

### Container Images

Inference code can be stored in Amazon Elastic Container Registry (Amazon ECR). The image is
specified via the `image` property of `ContainerDefinition`, which accepts a class that extends the
`ContainerImage` abstract base class.

#### Asset Image

Reference a local directory containing a Dockerfile:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';
import * as path from 'path';

const image = sagemaker.ContainerImage.fromAsset(path.join('path', 'to', 'Dockerfile', 'directory'));
```

#### ECR Image

Reference an image available within ECR:

```typescript
import * as ecr from '@aws-cdk/aws-ecr';
import * as sagemaker from '@aws-cdk/aws-sagemaker';

const repository = ecr.Repository.fromRepositoryName(this, 'Repository', 'repo');
const image = sagemaker.ContainerImage.fromEcrRepository(repository, 'tag');
```
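
When no tag is passed, `fromEcrRepository` falls back to `latest` (per its signature in
`lib/container-image.ts`), and the resulting image name follows the `registry/repository:tag` form
documented on `ContainerImageConfig`. The sketch below only illustrates that naming shape in plain
TypeScript. `ecrImageName` is a hypothetical helper, not part of this module, and it assumes the
standard `amazonaws.com` domain suffix:

```typescript
// Hypothetical helper: illustrates the `registry/repository:tag` image name shape.
// Assumes the standard `amazonaws.com` suffix; the construct resolves the real URI itself.
function ecrImageName(
  account: string,
  region: string,
  repositoryName: string,
  tag: string = 'latest',
): string {
  const registry = `${account}.dkr.ecr.${region}.amazonaws.com`;
  return `${registry}/${repositoryName}:${tag}`;
}

// Omitting the tag uses the same default as the signature shown in the source.
console.log(ecrImageName('012345678910', 'us-east-1', 'repo'));
console.log(ecrImageName('012345678910', 'us-east-1', 'repo', 'v1'));
```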

### Model Artifacts

If you choose to decouple your model artifacts from your inference code (as is natural given
different rates of change between inference code and model artifacts), the artifacts can be
specified via the `modelData` property which accepts a class that extends the `ModelData` abstract
base class. The default is to have no model artifacts associated with a model.

#### Asset Model Data

Reference local model data:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';
import * as path from 'path';

const modelData = sagemaker.ModelData.fromAsset(path.join('path', 'to', 'artifact', 'file.tar.gz'));
```
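
Local artifacts passed to `ModelData.fromAsset` must be a gzipped tar file: the constructor in
`lib/model-data.ts` rejects any path that does not end in `.tar.gz`, compared case-insensitively.
A minimal standalone sketch of that check follows. `isSupportedArtifact` is a hypothetical name
used only for illustration:

```typescript
// Mirrors the extension check performed for local asset model data.
const ARTIFACT_EXTENSION = '.tar.gz';

// Hypothetical helper, not exported by the module.
function isSupportedArtifact(artifactPath: string): boolean {
  // Lower-casing first makes the comparison case-insensitive.
  return artifactPath.toLowerCase().endsWith(ARTIFACT_EXTENSION);
}

console.log(isSupportedArtifact('path/to/artifact/file.tar.gz'));
console.log(isSupportedArtifact('path/to/artifact/file.zip'));
```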

#### S3 Model Data

Reference an S3 bucket and object key as the artifacts for a model:

```typescript
import * as s3 from '@aws-cdk/aws-s3';
import * as sagemaker from '@aws-cdk/aws-sagemaker';

const bucket = new s3.Bucket(this, 'MyBucket');
const modelData = sagemaker.ModelData.fromBucket(bucket, 'path/to/artifact/file.tar.gz');
```
83 changes: 83 additions & 0 deletions packages/@aws-cdk/aws-sagemaker/lib/container-image.ts
@@ -0,0 +1,83 @@
import * as ecr from '@aws-cdk/aws-ecr';
import * as assets from '@aws-cdk/aws-ecr-assets';
import { Construct } from 'constructs';
import { Model } from './model';
import { hashcode } from './private/util';

/**
 * The configuration for creating a container image.
 */
export interface ContainerImageConfig {
  /**
   * The image name. Images in Amazon ECR repositories can be specified by either using the full registry/repository:tag or
   * registry/repository@digest.
   *
   * For example, `012345678910.dkr.ecr.<region-name>.amazonaws.com/<repository-name>:latest` or
   * `012345678910.dkr.ecr.<region-name>.amazonaws.com/<repository-name>@sha256:94afd1f2e64d908bc90dbca0035a5b567EXAMPLE`.
   */
  readonly imageName: string;
}

/**
 * Constructs for types of container images
 */
export abstract class ContainerImage {
  /**
   * Reference an image in an ECR repository
   */
  public static fromEcrRepository(repository: ecr.IRepository, tag: string = 'latest'): ContainerImage {
    return new EcrImage(repository, tag);
  }

  /**
   * Reference an image that's constructed directly from sources on disk
   * @param directory The directory where the Dockerfile is stored
   * @param options The options to further configure the selected image
   */
  public static fromAsset(directory: string, options: assets.DockerImageAssetOptions = {}): ContainerImage {
    return new AssetImage(directory, options);
  }

  /**
   * Called when the image is used by a Model
   */
  public abstract bind(scope: Construct, model: Model): ContainerImageConfig;
}

class EcrImage extends ContainerImage {
  constructor(private readonly repository: ecr.IRepository, private readonly tag: string) {
    super();
  }

  public bind(_scope: Construct, model: Model): ContainerImageConfig {
    this.repository.grantPull(model);

    return {
      imageName: this.repository.repositoryUriForTag(this.tag),
    };
  }
}

class AssetImage extends ContainerImage {
  private asset?: assets.DockerImageAsset;

  constructor(private readonly directory: string, private readonly options: assets.DockerImageAssetOptions = {}) {
    super();
  }

  public bind(scope: Construct, model: Model): ContainerImageConfig {
    // Retain the first instantiation of this asset
    if (!this.asset) {
      this.asset = new assets.DockerImageAsset(scope, `ModelImage${hashcode(this.directory)}`, {
        directory: this.directory,
        ...this.options,
      });
    }

    this.asset.repository.grantPull(model);

    return {
      imageName: this.asset.imageUri,
    };
  }
}
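
The guard in `AssetImage.bind` means that a single `ContainerImage` instance shared by several
models creates its Docker image asset only once, under the scope of the first caller. The pattern
can be sketched in isolation as follows. `LazyResource` and its factory callback are hypothetical
stand-ins with no CDK dependency, and scopes are plain strings here:

```typescript
// Hypothetical stand-in for an object that lazily creates and then reuses a resource.
class LazyResource<T> {
  private resource?: T;

  constructor(private readonly create: (scope: string) => T) {}

  // Creates the resource on the first call; later calls reuse it, whatever scope they pass.
  public bind(scope: string): T {
    if (this.resource === undefined) {
      this.resource = this.create(scope);
    }
    return this.resource;
  }
}

// Count how many times the factory actually runs.
let created = 0;
const lazy = new LazyResource<string>((scope: string) => {
  created += 1;
  return `asset-in-${scope}`;
});

const first = lazy.bind('ModelA');
const second = lazy.bind('ModelB');
```

Both calls yield the resource created for the first scope, so the factory runs a single time.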
4 changes: 4 additions & 0 deletions packages/@aws-cdk/aws-sagemaker/lib/index.ts
@@ -1,2 +1,6 @@
export * from './container-image';
export * from './model';
export * from './model-data';

// AWS::SageMaker CloudFormation Resources:
export * from './sagemaker.generated';
93 changes: 93 additions & 0 deletions packages/@aws-cdk/aws-sagemaker/lib/model-data.ts
@@ -0,0 +1,93 @@
import * as s3 from '@aws-cdk/aws-s3';
import * as assets from '@aws-cdk/aws-s3-assets';
import { Construct } from 'constructs';
import { IModel } from './model';
import { hashcode } from './private/util';

// The only supported extension for local asset model data
// https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-sagemaker-model-containerdefinition.html#cfn-sagemaker-model-containerdefinition-modeldataurl
const ARTIFACT_EXTENSION = '.tar.gz';

/**
 * The configuration needed to reference model artifacts.
 */
export interface ModelDataConfig {
  /**
   * The S3 path where the model artifacts, which result from model training, are stored. This path
   * must point to a single gzip compressed tar archive (.tar.gz suffix).
   */
  readonly uri: string;
}

/**
 * Model data represents the source of model artifacts, which will ultimately be loaded from an S3
 * location.
 */
export abstract class ModelData {
  /**
   * Constructs model data which is already available within S3.
   * @param bucket The S3 bucket within which the model artifacts are stored
   * @param objectKey The S3 object key at which the model artifacts are stored
   */
  public static fromBucket(bucket: s3.IBucket, objectKey: string): ModelData {
    return new S3ModelData(bucket, objectKey);
  }

  /**
   * Constructs model data that will be uploaded to S3 as part of the CDK app deployment.
   * @param path The local path to a model artifact file as a gzipped tar file
   * @param options The options to further configure the selected asset
   */
  public static fromAsset(path: string, options: assets.AssetOptions = {}): ModelData {
    return new AssetModelData(path, options);
  }

  /**
   * This method is invoked by the SageMaker Model construct when it needs to resolve the model
   * data to a URI.
   * @param scope The scope within which the model data is resolved
   * @param model The Model construct performing the URI resolution
   */
  public abstract bind(scope: Construct, model: IModel): ModelDataConfig;
}

class S3ModelData extends ModelData {
  constructor(private readonly bucket: s3.IBucket, private readonly objectKey: string) {
    super();
  }

  public bind(_scope: Construct, model: IModel): ModelDataConfig {
    this.bucket.grantRead(model);

    return {
      uri: this.bucket.urlForObject(this.objectKey),
    };
  }
}

class AssetModelData extends ModelData {
  private asset?: assets.Asset;

  constructor(private readonly path: string, private readonly options: assets.AssetOptions) {
    super();
    if (!path.toLowerCase().endsWith(ARTIFACT_EXTENSION)) {
      throw new Error(`Asset must be a gzipped tar file with extension ${ARTIFACT_EXTENSION} (${this.path})`);
    }
  }

  public bind(scope: Construct, model: IModel): ModelDataConfig {
    // Retain the first instantiation of this asset
    if (!this.asset) {
      this.asset = new assets.Asset(scope, `ModelData${hashcode(this.path)}`, {
        path: this.path,
        ...this.options,
      });
    }

    this.asset.grantRead(model);

    return {
      uri: this.asset.httpUrl,
    };
  }
}