Add transformNameMapping to dataflow job (#3964) (#7259)
* initial commit - wip

* tests updated and doc added

* example header added to doc

* description updated for transform_name_mapping

Signed-off-by: Modular Magician <[email protected]>
modular-magician authored Sep 14, 2020
1 parent 960ce26 commit a37a42e
Showing 4 changed files with 48 additions and 4 deletions.
3 changes: 3 additions & 0 deletions .changelog/3964.txt
@@ -0,0 +1,3 @@
```release-note:enhancement
dataflow: added `transform_name_mapping` to `google_dataflow_job`
```
16 changes: 12 additions & 4 deletions google/resource_dataflow_job.go
@@ -111,6 +111,12 @@ func resourceDataflowJob() *schema.Resource {
Description: `User labels to be specified for the job. Keys and values should follow the restrictions specified in the labeling restrictions page. NOTE: Google-provided Dataflow templates often provide default labels that begin with goog-dataflow-provided. Unless explicitly set in config, these labels will be ignored to prevent diffs on re-apply.`,
},

"transform_name_mapping": {
Type: schema.TypeMap,
Optional: true,
Description: `Only applicable when updating a pipeline. Map of transform name prefixes of the job to be replaced with the corresponding name prefixes of the new job.`,
},

"on_delete": {
Type: schema.TypeString,
ValidateFunc: validation.StringInSlice([]string{"cancel", "drain"}, false),
@@ -314,17 +320,19 @@ func resourceDataflowJobUpdateByReplacement(d *schema.ResourceData, meta interface{}) error {
}

params := expandStringMap(d, "parameters")
tnamemapping := expandStringMap(d, "transform_name_mapping")

env, err := resourceDataflowJobSetupEnv(d, config)
if err != nil {
return err
}

request := dataflow.LaunchTemplateParameters{
JobName: d.Get("name").(string),
Parameters: params,
Environment: &env,
Update: true,
JobName: d.Get("name").(string),
Parameters: params,
TransformNameMapping: tnamemapping,
Environment: &env,
Update: true,
}

var response *dataflow.LaunchTemplateResponse
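The update path above reads the configured map with the provider's `expandStringMap` helper and passes it straight through as `TransformNameMapping` on the `dataflow.LaunchTemplateParameters` request. The helper itself is not part of this diff; the following is a minimal sketch of what such a conversion looks like, assuming it simply coerces the `schema.TypeMap` value into the `map[string]string` the API client expects (the function name, signature, and import path here are illustrative, not the provider's actual implementation).

```go
package google

import (
	"github.com/hashicorp/terraform-plugin-sdk/helper/schema"
)

// expandStringMapSketch is a hypothetical stand-in for the provider's
// expandStringMap helper: it reads a schema.TypeMap attribute such as
// "transform_name_mapping" and converts it into the map[string]string
// that dataflow.LaunchTemplateParameters.TransformNameMapping expects.
func expandStringMapSketch(d *schema.ResourceData, key string) map[string]string {
	out := map[string]string{}
	// d.Get on a TypeMap returns map[string]interface{}; it is empty when unset.
	for k, v := range d.Get(key).(map[string]interface{}) {
		out[k] = v.(string)
	}
	return out
}
```

Because the request is sent with `Update: true`, Dataflow uses the resulting `transformNameMapping` to match transform names between the running job and its replacement when the pipeline is updated in place.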
4 changes: 4 additions & 0 deletions google/resource_dataflow_job_test.go
@@ -827,6 +827,10 @@ resource "google_dataflow_job" "pubsub_stream" {
inputFilePattern = "${google_storage_bucket.bucket1.url}/*.json"
outputTopic = google_pubsub_topic.topic.id
}
transform_name_mapping = {
name = "test_job"
env = "test"
}
on_delete = "cancel"
}
`, suffix, suffix, suffix, suffix, testDataflowJobTemplateTextToPubsub, tempLocation)
29 changes: 29 additions & 0 deletions website/docs/r/dataflow_job.html.markdown
@@ -26,6 +26,34 @@ resource "google_dataflow_job" "big_data_job" {
}
}
```
## Example Usage - Streaming Job
```hcl
resource "google_pubsub_topic" "topic" {
name = "dataflow-job1"
}
resource "google_storage_bucket" "bucket1" {
name = "tf-test-bucket1"
force_destroy = true
}
resource "google_storage_bucket" "bucket2" {
name = "tf-test-bucket2"
force_destroy = true
}
resource "google_dataflow_job" "pubsub_stream" {
name = "tf-test-dataflow-job1"
template_gcs_path = "gs://my-bucket/templates/template_file"
temp_gcs_location = "gs://my-bucket/tmp_dir"
parameters = {
inputFilePattern = "${google_storage_bucket.bucket1.url}/*.json"
outputTopic = google_pubsub_topic.topic.id
}
transform_name_mapping = {
name = "test_job"
env = "test"
}
on_delete = "cancel"
}
```

## Note on "destroy" / "apply"
There are many types of Dataflow jobs. Some Dataflow jobs run constantly, getting new data from (e.g.) a GCS bucket, and outputting data continuously. Some jobs process a set amount of data then terminate. All jobs can fail while running due to programming errors or other issues. In this way, Dataflow jobs are different from most other Terraform / Google resources.
@@ -49,6 +77,7 @@ The following arguments are supported:
specified in the [labeling restrictions](https://cloud.google.com/compute/docs/labeling-resources#restrictions) page.
**NOTE**: Google-provided Dataflow templates often provide default labels that begin with `goog-dataflow-provided`.
Unless explicitly set in config, these labels will be ignored to prevent diffs on re-apply.
* `transform_name_mapping` - (Optional) Only applicable when updating a pipeline. Map of transform name prefixes of the job to be replaced with the corresponding name prefixes of the new job. This field is not used outside of update.
* `max_workers` - (Optional) The number of workers permitted to work on the job. More workers may improve processing speed at additional cost.
* `on_delete` - (Optional) One of "drain" or "cancel". Specifies behavior of deletion during `terraform destroy`. See above note.
* `project` - (Optional) The project in which the resource belongs. If it is not provided, the provider project is used.
