Update BigQuery timeout + retry configs for v1.1 (#1289)
* Update bigquery-profile.md

Update docs according to the changes in dbt-labs/dbt-bigquery#50

* Update upgrading-to-1-0-0.md

* Add versioning logic. Edit

* Update migration guides

* PR feedback, corrections

Co-authored-by: Hui Zheng <[email protected]>
jtcohen6 and hui-zheng authored Apr 7, 2022
1 parent 523d9d7 commit ef8014e
Showing 2 changed files with 93 additions and 2 deletions.
@@ -36,4 +36,5 @@ _Note: If you're contributing docs for a new or updated feature in v1.1, please

### Plugins

- **dbt-bigquery:** Support for finer-grained configuration of query timeout and retry when defining your [connection profile](bigquery-profile).
- **dbt-spark** added support for a [`session` connection method](spark-profile#session), for use with a pySpark session, to support rapid iteration when developing advanced or experimental functionality. This connection method is not recommended for new users, and it is not supported in dbt Cloud.
94 changes: 92 additions & 2 deletions website/docs/reference/warehouse-profiles/bigquery-profile.md
@@ -195,7 +195,93 @@ my-profile:
      priority: interactive
```
### Timeouts
### Timeouts and Retries
<VersionBlock firstVersion="1.1">
The `dbt-bigquery` plugin uses the BigQuery Python client library to submit queries. Each query requires two steps:
1. Job creation: Submit the query job to BigQuery, and receive its job ID.
2. Job execution: Wait for the query job to finish executing, and receive its result.

Some queries inevitably fail, at different points in the process. To handle these cases, dbt supports fine-grained configuration for query timeouts and retries.

#### job_execution_timeout_seconds

Use the `job_execution_timeout_seconds` configuration to set the number of seconds dbt should wait for a query to complete after it has been submitted successfully. Of the four configurations that control timeouts and retries, this is the one most commonly used.

:::info Renamed config

In older versions of `dbt-bigquery`, this same config was called `timeout_seconds`.

:::

The default value is 300 seconds. If any dbt query, including a model's SQL transformation, takes longer than 300 seconds to complete, BigQuery might cancel the query and issue the following error:

```
Operation did not complete within the designated timeout.
```

You can change the timeout for the job execution step by setting `job_execution_timeout_seconds` in the BigQuery profile:

```yaml
my-profile:
  target: dev
  outputs:
    dev:
      type: bigquery
      method: oauth
      project: abc-123
      dataset: my_dataset
      job_execution_timeout_seconds: 600 # 10 minutes
```

#### job_creation_timeout_seconds

It is also possible for a query job to fail to submit in the first place. You can set the maximum timeout for the job creation step with `job_creation_timeout_seconds`. No timeout is set by default.

In the job creation step, dbt simply submits a query job to BigQuery's `Jobs.Insert` API and receives a query job ID in return. This should take a few seconds at most, though in rare situations it can take longer.
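
As a minimal sketch, the job creation timeout can be set in the same kind of profile used elsewhere on this page. The 30-second value is illustrative rather than a recommendation, and the profile, project, and dataset names are placeholders:

```yaml
my-profile:
  target: dev
  outputs:
    dev:
      type: bigquery
      method: oauth
      project: abc-123
      dataset: my_dataset
      job_creation_timeout_seconds: 30 # illustrative value; no timeout is set by default
```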

#### job_retries

Google's BigQuery Python client has native support for retrying query jobs that time out, or queries that run into transient errors and are likely to succeed if run again. You can set the maximum number of retries with `job_retries`.

:::info Renamed config

In older versions of `dbt-bigquery`, the `job_retries` config was just called `retries`.

:::

The default value is 1, meaning that dbt will retry failing queries exactly once. You can set the configuration to 0 to disable retries entirely.
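
For example, a profile that disables retries might look like the sketch below. The profile, project, and dataset names are placeholders carried over from the other examples on this page:

```yaml
my-profile:
  target: dev
  outputs:
    dev:
      type: bigquery
      method: oauth
      project: abc-123
      dataset: my_dataset
      job_retries: 0 # disable retries; the default is 1 (this config was called `retries` in older versions)
```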

#### job_retry_deadline_seconds

After a query job times out or encounters a transient error, dbt will wait one second before retrying the same query. In cases where queries are repeatedly timing out, this can add up to a long wait. Use the `job_retry_deadline_seconds` configuration to set the total number of seconds you're willing to wait (the "deadline") while retrying the same query. If dbt hits the deadline, it will give up and return an error.

Combining the four configurations above, we can maximize our chances of mitigating intermittent query errors. In the example below, we will wait up to 30 seconds for initial job creation. Then, we'll wait up to 10 minutes (600 seconds) for the query to return results. If the query times out, or encounters a transient error, we will retry it up to 5 times. The whole process cannot take longer than 20 minutes (1200 seconds). At that point, dbt will raise an error.

<File name='profiles.yml'>

```yaml
my-profile:
  target: dev
  outputs:
    dev:
      type: bigquery
      method: oauth
      project: abc-123
      dataset: my_dataset
      job_creation_timeout_seconds: 30
      job_execution_timeout_seconds: 600
      job_retries: 5
      job_retry_deadline_seconds: 1200
```

</File>

</VersionBlock>

<VersionBlock lastVersion="1.0">

BigQuery supports query timeouts. By default, the timeout is set to 300 seconds. If a dbt model takes longer than this timeout to complete, then BigQuery may cancel the query and issue the following error:

@@ -205,6 +291,8 @@ BigQuery supports query timeouts. By default, the timeout is set to 300 seconds.

To change this timeout, use the `timeout_seconds` configuration:

<File name='profiles.yml'>

```yaml
my-profile:
  target: dev
@@ -217,7 +305,7 @@ my-profile:
      timeout_seconds: 600 # 10 minutes
```

### Retries
</File>

The `retries` profile configuration designates the number of times dbt should retry queries that result in unhandled server errors. This configuration is only specified for BigQuery targets. Example:

@@ -241,6 +329,8 @@ my-profile:

</File>

</VersionBlock>

### Dataset locations

The location of BigQuery datasets can be configured using the `location` configuration in a BigQuery profile.