Support Delta constraints #71

Merged
ueshin merged 5 commits into databricks:main from the delta-constraints branch
May 6, 2022

allisonwang-db
Collaborator

@allisonwang-db allisonwang-db commented Apr 1, 2022

Description

This PR adds support for Delta Constraints with dbt table models and incremental models. It supports two types of constraints: a model-level Check constraint and a column-level not_null constraint.

Users can specify model-level and column-level constraints under the meta field of the model config. For example:

# schema.yml
models:
  - name: my_model
    meta:
      constraints:
        - name: id_greater_than_zero
          condition: id > 0
    columns:
      - name: id
      - name: name
        meta:
          constraint: not_null

Constraints specified will be created if the persist_constraints config is enabled (default: false):

-- my_model.sql
{{ config(
    materialized='table',
    persist_constraints=True
) }}
...

Note: Delta constraints are only available in Databricks Runtime 7.4 and above, so DBR 7.3 LTS does not support this feature.
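
As a rough illustration of what `persist_constraints` implies, the sketch below translates the `schema.yml` example above into the Delta DDL statements it corresponds to. It is written in Python only so that it can be run standalone. The function `constraint_statements` and the dict layout are hypothetical and not part of dbt-databricks; the adapter issues its statements from Jinja macros, and the order used here is an assumption.

```python
# Hypothetical helper for illustration only; not part of dbt-databricks.
def constraint_statements(table, model, persist_constraints=False):
    """Return the Delta DDL implied by a model's `meta` constraints."""
    if not persist_constraints:
        # Opt-in: with the flag off, `meta` stays purely descriptive.
        return []
    statements = []
    # Column-level constraints: only not_null is covered by this PR.
    for column in model.get("columns", []):
        if column.get("meta", {}).get("constraint") == "not_null":
            statements.append(
                f"ALTER TABLE {table} ALTER COLUMN {column['name']} SET NOT NULL"
            )
    # Model-level Check constraints.
    for constraint in model.get("meta", {}).get("constraints", []):
        statements.append(
            f"ALTER TABLE {table} ADD CONSTRAINT {constraint['name']} "
            f"CHECK ({constraint['condition']})"
        )
    return statements


# Same content as the schema.yml example, expressed as a plain dict.
my_model = {
    "name": "my_model",
    "meta": {
        "constraints": [
            {"name": "id_greater_than_zero", "condition": "id > 0"},
        ]
    },
    "columns": [
        {"name": "id"},
        {"name": "name", "meta": {"constraint": "not_null"}},
    ],
}

for statement in constraint_statements("my_model", my_model, persist_constraints=True):
    print(statement)
```

With the flag left at its default of false, the helper returns an empty list, which mirrors the opt-in behaviour described above.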

Checklist

  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt-databricks next" section.

@allisonwang-db
Collaborator Author

cc @jtcohen6

Comment on lines 22 to 25
"+persist_docs": {
"relation": True,
"columns": True,
},
Collaborator

I'm just wondering if we need to enable them? I feel it's OK to enable them automatically if there are such configs in schema.yml.
I also feel a bit weird that the config named persist_docs enables the constraints.

Contributor

> I'm just wondering if we need to enable them? I feel it's OK to enable them automatically if there are such configs in schema.yml.

Fair, I think there are good arguments to be made in either direction. From my perspective, meta is usually for users' own metadata / annotation purposes, with no functional effect in dbt, until/unless they opt into that via an extension macro/package or custom config.

> I also feel a bit weird that the config named persist_docs enables the constraints.

Agreed. Even if the mechanism to add these constraints is very very similar to persist_docs, I'd rather see this enabled via another config. Users (and therefore adapter plugins) can define whichever configs they like. All that would require is another macro call within the materialization, right alongside the calls to persist_docs:

{% do persist_docs(target_relation, model) %}

{% do persist_docs(target_relation, model) %}

Collaborator Author

Makes sense! I will use a different config for constraints.

Collaborator

We might also need it in snapshot.sql?

{% do persist_docs(target_relation, model) %}

@allisonwang-db force-pushed the delta-constraints branch 2 times, most recently from 13ca3c6 to 6dd2a8f, on April 26, 2022 17:25
@allisonwang-db changed the title from "Support Delta constraints with persist_docs" to "Support Delta constraints" on Apr 27, 2022
Collaborator

@ueshin ueshin left a comment

LGTM.

> Note: Delta constraints are only available in Databricks Runtime 7.4 and above, so DBR 7.3 LTS does not support this feature.

If it doesn't support this feature with DBR 7.3 LTS, we should mention it in README to say:

  • The dbt-databricks adapter has been tested against Databricks SQL and Databricks runtime releases 9.1 LTS and later.

or

  • Delta constraints feature only works with Databricks runtime releases 9.1 LTS and later.

@superdupershant Which would you recommend, or other option?

@ueshin
Collaborator

ueshin commented May 6, 2022

Thanks! merging.

@ueshin ueshin merged commit 8db21bc into databricks:main May 6, 2022
ueshin pushed a commit to ueshin/dbt-databricks that referenced this pull request May 6, 2022
### Description
This PR adds support for [Delta Constraints](https://docs.databricks.com/delta/delta-constraints.html) with dbt table models and incremental models. It supports two types of constraints: a model-level [Check constraint](https://docs.databricks.com/delta/delta-constraints.html#check-constraint) and a column-level [not_null](https://docs.databricks.com/delta/delta-constraints.html#not-null-constraint) constraint.

Users can specify model-level and column-level constraints under the `meta` field of the model config. For example:
```yaml
# schema.yml
models:
  - name: my_model
    meta:
      constraints:
        - name: id_greater_than_zero
          condition: id > 0
    columns:
      - name: id
      - name: name
        meta:
          constraint: not_null
```
Constraints specified will be created if the `persist_constraints` config is enabled (default: false):
```sql
-- my_model.sql
{{ config(
    materialized='table',
    persist_constraints=True
) }}
...
```

Note: Delta constraints are only available in Databricks Runtime 7.4 and above, so DBR 7.3 LTS does not support this feature.
ueshin added a commit that referenced this pull request May 20, 2022
### Description

We introduced Delta constraints in #71, and a snapshot query can now fail when it violates those constraints.
In that case the temporary view is left behind, because snapshot uses a permanent view and only drops it later.

We should always drop these views.
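
The "always drop" behaviour described above can be sketched with a `try`/`finally` block. This is a generic illustration in Python, not the code from the follow-up commit; `run_snapshot`, `fake_execute`, `ConstraintViolation`, and the view name are all hypothetical stand-ins.

```python
class ConstraintViolation(Exception):
    """Stand-in for the error raised when a Delta constraint is violated."""


def run_snapshot(execute, staging_view, merge_sql):
    """Run the snapshot merge and drop the staging view even on failure."""
    try:
        execute(merge_sql)
    finally:
        # Runs whether or not the merge raised, so the view never lingers.
        execute(f"DROP VIEW IF EXISTS {staging_view}")


# Fake executor that records statements and fails on the merge,
# simulating a snapshot that violates a constraint.
executed = []


def fake_execute(sql):
    executed.append(sql)
    if sql.startswith("MERGE"):
        raise ConstraintViolation("CHECK constraint id_greater_than_zero violated")


try:
    run_snapshot(fake_execute, "my_snapshot__dbt_tmp", "MERGE INTO my_snapshot ...")
except ConstraintViolation:
    pass  # The failure still propagates to the caller; only cleanup is guaranteed.

print(executed)
```

The cleanup statement is recorded after the failing merge, and the original error is still raised to the caller rather than swallowed.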
ueshin added a commit to ueshin/dbt-databricks that referenced this pull request May 20, 2022
3 participants