Skip to content

Commit

Permalink
Add the vertex endpoint resource. (#6583)
Browse files Browse the repository at this point in the history
* Add the vertex endpoint resource.

* Add a KMS key to the test.

* Add an iam member for encypter decrypter role on kms key.

* Use a bootstrapped test network.

* Make bootstrapped network a data source.

* Use long form for self link.

* Add property overrides.

* Make accelerator_type a string.
  • Loading branch information
trodge authored Oct 3, 2022
1 parent 3b8ba06 commit 1750a07
Show file tree
Hide file tree
Showing 3 changed files with 270 additions and 0 deletions.
210 changes: 210 additions & 0 deletions mmv1/products/vertexai/api.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -96,13 +96,222 @@ objects:
description: |
Required. The Cloud KMS resource identifier of the customer managed encryption key used to protect a resource.
Has the form: projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key. The key needs to be in the same region as where the resource is created.
required: true
input: true
- !ruby/object:Api::Type::String
name: 'metadataSchemaUri'
required: true
input: true
description: |
Points to a YAML file stored on Google Cloud Storage describing additional information about the Dataset. The schema is defined as an OpenAPI 3.0.2 Schema Object. The schema files that can be used here are found in gs://google-cloud-aiplatform/schema/dataset/metadata/.
# Vertex AI Endpoints
- !ruby/object:Api::Resource
name: Endpoint
base_url: projects/{{project}}/locations/{{location}}/endpoints
create_url: projects/{{project}}/locations/{{location}}/endpoints
self_link: 'projects/{{project}}/locations/{{location}}/endpoints/{{name}}'
update_verb: :PATCH
update_mask: true
references: !ruby/object:Api::Resource::ReferenceLinks
guides:
'Official Documentation':
'https://cloud.google.com/vertex-ai/docs'
api: 'https://cloud.google.com/vertex-ai/docs/reference/rest/v1beta1/projects.locations.endpoints'
async: !ruby/object:Api::OpAsync
operation: !ruby/object:Api::OpAsync::Operation
path: 'name'
base_url: '{{op_id}}'
wait_ms: 1000
result: !ruby/object:Api::OpAsync::Result
path: 'response'
resource_inside_response: true
status: !ruby/object:Api::OpAsync::Status
path: 'done'
complete: True
allowed:
- True
- False
error: !ruby/object:Api::OpAsync::Error
path: 'error'
message: 'message'
description: "Models are deployed into it, and afterwards Endpoint is called to obtain predictions and explanations."
parameters:
- !ruby/object:Api::Type::String
name: location
description: The location for the resource
url_param_only: true
input: true
properties:
- !ruby/object:Api::Type::String
name: name
description: Output only. The resource name of the Endpoint.
output: true
- !ruby/object:Api::Type::String
name: displayName
description: Required. The display name of the Endpoint. The name can be up to 128 characters long and can consist of any UTF-8 characters.
required: true
- !ruby/object:Api::Type::String
name: description
description: The description of the Endpoint.
- !ruby/object:Api::Type::Array
name: deployedModels
description: Output only. The models deployed in this Endpoint. To add or remove DeployedModels use EndpointService.DeployModel and EndpointService.UndeployModel respectively.
output: true
item_type: !ruby/object:Api::Type::NestedObject
name: deployedModels
description: Output only. The models deployed in this Endpoint. To add or remove DeployedModels use EndpointService.DeployModel and EndpointService.UndeployModel respectively.
properties:
- !ruby/object:Api::Type::NestedObject
name: dedicatedResources
description: A description of resources that are dedicated to the DeployedModel, and that need a higher degree of manual configuration.
output: true
properties:
- !ruby/object:Api::Type::NestedObject
name: machineSpec
description: The specification of a single machine used by the prediction.
output: true
properties:
- !ruby/object:Api::Type::String
name: machineType
description: 'The type of the machine. See the [list of machine types supported for prediction](https://cloud.google.com/vertex-ai/docs/predictions/configure-compute#machine-types) See the [list of machine types supported for custom training](https://cloud.google.com/vertex-ai/docs/training/configure-compute#machine-types). For DeployedModel this field is optional, and the default value is `n1-standard-2`. For BatchPredictionJob or as part of WorkerPoolSpec this field is required. TODO(rsurowka): Try to better unify the required vs optional.'
output: true
- !ruby/object:Api::Type::String
name: acceleratorType
description: The type of accelerator(s) that may be attached to the machine as per accelerator_count. See possible values [here](https://cloud.google.com/vertex-ai/docs/reference/rest/v1/MachineSpec#AcceleratorType).
output: true
- !ruby/object:Api::Type::Integer
name: acceleratorCount
description: The number of accelerators to attach to the machine.
output: true
- !ruby/object:Api::Type::Integer
name: minReplicaCount
description: The minimum number of machine replicas this DeployedModel will be always deployed on. This value must be greater than or equal to 1. If traffic against the DeployedModel increases, it may dynamically be deployed onto more replicas, and as traffic decreases, some of these extra replicas may be freed.
output: true
- !ruby/object:Api::Type::Integer
name: maxReplicaCount
description: The maximum number of replicas this DeployedModel may be deployed on when the traffic against it increases. If the requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale the model to that many replicas is guaranteed (barring service outages). If traffic against the DeployedModel increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, will use min_replica_count as the default value. The value of this field impacts the charge against Vertex CPU and GPU quotas. Specifically, you will be charged for max_replica_count * number of cores in the selected machine type) and (max_replica_count * number of GPUs per replica in the selected machine type).
output: true
- !ruby/object:Api::Type::Array
name: autoscalingMetricSpecs
description: The metric specifications that overrides a resource utilization metric (CPU utilization, accelerator's duty cycle, and so on) target value (default to 60 if not set). At most one entry is allowed per metric. If machine_spec.accelerator_count is above 0, the autoscaling will be based on both CPU utilization and accelerator's duty cycle metrics and scale up when either metrics exceeds its target value while scale down if both metrics are under their target value. The default target value is 60 for both metrics. If machine_spec.accelerator_count is 0, the autoscaling will be based on CPU utilization metric only with default target value 60 if not explicitly set. For example, in the case of Online Prediction, if you want to override target CPU utilization to 80, you should set autoscaling_metric_specs.metric_name to `aiplatform.googleapis.com/prediction/online/cpu/utilization` and autoscaling_metric_specs.target to `80`.
output: true
item_type: !ruby/object:Api::Type::NestedObject
name: autoscalingMetricSpecs
description: The metric specifications that overrides a resource utilization metric (CPU utilization, accelerator's duty cycle, and so on) target value (default to 60 if not set). At most one entry is allowed per metric. If machine_spec.accelerator_count is above 0, the autoscaling will be based on both CPU utilization and accelerator's duty cycle metrics and scale up when either metrics exceeds its target value while scale down if both metrics are under their target value. The default target value is 60 for both metrics. If machine_spec.accelerator_count is 0, the autoscaling will be based on CPU utilization metric only with default target value 60 if not explicitly set. For example, in the case of Online Prediction, if you want to override target CPU utilization to 80, you should set autoscaling_metric_specs.metric_name to `aiplatform.googleapis.com/prediction/online/cpu/utilization` and autoscaling_metric_specs.target to `80`.
properties:
- !ruby/object:Api::Type::String
name: metricName
description: 'The resource metric name. Supported metrics: * For Online Prediction: * `aiplatform.googleapis.com/prediction/online/accelerator/duty_cycle` * `aiplatform.googleapis.com/prediction/online/cpu/utilization`'
output: true
- !ruby/object:Api::Type::Integer
name: target
description: The target resource utilization in percentage (1% - 100%) for the given metric; once the real usage deviates from the target by a certain percentage, the machine replicas change. The default value is 60 (representing 60%) if not provided.
output: true
- !ruby/object:Api::Type::NestedObject
name: automaticResources
description: A description of resources that to large degree are decided by Vertex AI, and require only a modest additional configuration.
output: true
properties:
- !ruby/object:Api::Type::Integer
name: minReplicaCount
description: The minimum number of replicas this DeployedModel will be always deployed on. If traffic against it increases, it may dynamically be deployed onto more replicas up to max_replica_count, and as traffic decreases, some of these extra replicas may be freed. If the requested value is too large, the deployment will error.
output: true
- !ruby/object:Api::Type::Integer
name: maxReplicaCount
description: The maximum number of replicas this DeployedModel may be deployed on when the traffic against it increases. If the requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale the model to that many replicas is guaranteed (barring service outages). If traffic against the DeployedModel increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, a no upper bound for scaling under heavy traffic will be assume, though Vertex AI may be unable to scale beyond certain replica number.
output: true
- !ruby/object:Api::Type::String
name: id
description: The ID of the DeployedModel. If not provided upon deployment, Vertex AI will generate a value for this ID. This value should be 1-10 characters, and valid characters are /[0-9]/.
output: true
- !ruby/object:Api::Type::String
name: model
description: The name of the Model that this is the deployment of. Note that the Model may be in a different location than the DeployedModel's Endpoint.
output: true
- !ruby/object:Api::Type::String
name: modelVersionId
description: Output only. The version ID of the model that is deployed.
output: true
- !ruby/object:Api::Type::String
name: displayName
description: The display name of the DeployedModel. If not provided upon creation, the Model's display_name is used.
output: true
- !ruby/object:Api::Type::String
name: createTime
description: Output only. Timestamp when the DeployedModel was created.
output: true
- !ruby/object:Api::Type::String
name: serviceAccount
description: The service account that the DeployedModel's container runs as. Specify the email address of the service account. If this service account is not specified, the container runs as a service account that doesn't have access to the resource project. Users deploying the Model must have the `iam.serviceAccounts.actAs` permission on this service account.
output: true
- !ruby/object:Api::Type::Boolean
name: enableAccessLogging
description: These logs are like standard server access logs, containing information like timestamp and latency for each prediction request. Note that Stackdriver logs may incur a cost, especially if your project receives prediction requests at a high queries per second rate (QPS). Estimate your costs before enabling this option.
output: true
- !ruby/object:Api::Type::NestedObject
name: privateEndpoints
description: Output only. Provide paths for users to send predict/explain/health requests directly to the deployed model services running on Cloud via private services access. This field is populated if network is configured.
output: true
properties:
- !ruby/object:Api::Type::String
name: predictHttpUri
description: Output only. Http(s) path to send prediction requests.
output: true
- !ruby/object:Api::Type::String
name: explainHttpUri
description: Output only. Http(s) path to send explain requests.
output: true
- !ruby/object:Api::Type::String
name: healthHttpUri
description: Output only. Http(s) path to send health check requests.
output: true
- !ruby/object:Api::Type::String
name: serviceAttachment
description: Output only. The name of the service attachment resource. Populated if private service connect is enabled.
output: true
- !ruby/object:Api::Type::String
name: sharedResources
description: 'The resource name of the shared DeploymentResourcePool to deploy on. Format: projects/{project}/locations/{location}/deploymentResourcePools/{deployment_resource_pool}'
output: true
- !ruby/object:Api::Type::Boolean
name: enableContainerLogging
description: If true, the container of the DeployedModel instances will send `stderr` and `stdout` streams to Stackdriver Logging. Only supported for custom-trained Models and AutoML Tabular Models.
output: true
- !ruby/object:Api::Type::String
name: etag
description: Used to perform consistent read-modify-write updates. If not set, a blind "overwrite" update happens.
output: true
- !ruby/object:Api::Type::KeyValuePairs
name: labels
description: The labels with user-defined metadata to organize your Endpoints. Label keys and values can be no longer than 64 characters (Unicode codepoints), can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. See https://goo.gl/xmQnxf for more information and examples of labels.
- !ruby/object:Api::Type::String
name: createTime
description: Output only. Timestamp when this Endpoint was created.
output: true
- !ruby/object:Api::Type::String
name: updateTime
description: Output only. Timestamp when this Endpoint was last updated.
output: true
- !ruby/object:Api::Type::NestedObject
name: encryptionSpec
description: Customer-managed encryption key spec for an Endpoint. If set, this Endpoint and all sub-resources of this Endpoint will be secured by this key.
input: true
properties:
- !ruby/object:Api::Type::String
name: kmsKeyName
description: 'Required. The Cloud KMS resource identifier of the customer managed encryption key used to protect a resource. Has the form: `projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key`. The key needs to be in the same region as where the compute resource is created.'
required: true
input: true
- !ruby/object:Api::Type::String
name: network
description: 'The full name of the Google Compute Engine [network](https://cloud.google.com//compute/docs/networks-and-firewalls#networks) to which the Endpoint should be peered. Private services access must already be configured for the network. If left unspecified, the Endpoint is not peered with any network. Only one of the fields, network or enable_private_service_connect, can be set. [Format](https://cloud.google.com/compute/docs/reference/rest/v1/networks/insert): `projects/{project}/global/networks/{network}`. Where `{project}` is a project number, as in `12345`, and `{network}` is network name.'
input: true
- !ruby/object:Api::Type::String
name: modelDeploymentMonitoringJob
description: 'Output only. Resource name of the Model Monitoring job associated with this Endpoint if monitoring is enabled by CreateModelDeploymentMonitoringJob. Format: `projects/{project}/locations/{location}/modelDeploymentMonitoringJobs/{model_deployment_monitoring_job}`'
output: true

# Vertex AI Featurestores
- !ruby/object:Api::Resource
name: Featurestore
Expand Down Expand Up @@ -422,6 +631,7 @@ objects:
description: |
Required. The Cloud KMS resource identifier of the customer managed encryption key used to protect a resource.
Has the form: projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key. The key needs to be in the same region as where the resource is created.
required: true
input: true
- !ruby/object:Api::Type::NestedObject
name: 'state'
Expand Down
19 changes: 19 additions & 0 deletions mmv1/products/vertexai/terraform.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,25 @@ overrides: !ruby/object:Overrides::ResourceOverrides
default_from_api: true
region: !ruby/object:Overrides::Terraform::PropertyOverride
default_from_api: true
Endpoint: !ruby/object:Overrides::Terraform::ResourceOverride
examples:
- !ruby/object:Provider::Terraform::Examples
name: "vertex_ai_endpoint_network"
primary_resource_id: "endpoint"
vars:
name: "vertex_ai_endpoint"
project: "vertex-ai"
address_name: "address-name"
kms_key_name: "kms-name"
network_name: "network-name"
test_vars_overrides:
kms_key_name: 'BootstrapKMSKeyInLocation(t, "us-central1").CryptoKey.Name'
network_name: 'BootstrapSharedTestNetwork(t, "vertex")'
properties:
etag: !ruby/object:Overrides::Terraform::PropertyOverride
ignore_read: true
name: !ruby/object:Overrides::Terraform::PropertyOverride
custom_flatten: templates/terraform/custom_flatten/name_from_self_link.erb
Featurestore: !ruby/object:Overrides::Terraform::ResourceOverride
autogen_async: false
skip_sweeper: true
Expand Down
Loading

0 comments on commit 1750a07

Please sign in to comment.