Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dbt Constraints / model contracts #426

Merged
merged 25 commits into from
Feb 17, 2023
Merged

dbt Constraints / model contracts #426

merged 25 commits into from
Feb 17, 2023

Conversation

Victoriapm
Copy link
Contributor

@Victoriapm Victoriapm commented Dec 12, 2022

resolves #444
Related to dbt-core issue #6079, allowing to define constraints and data types in model config for BigQuery
Related to #6271

Description

  • The macro will retrieve data types and constraints defined in the model YML file and use it to generate the model DDL
  • All columns must be defined in the YML in the expected order, each of them will require a valid BigQuery data type
  • BigQuery only enforces not null constraint, others like check, will fail

Related Core/Adapter Pull Requests

Must be reviewed with passing tests

Checklist

@cla-bot
Copy link

cla-bot bot commented Dec 12, 2022

Thanks for your pull request, and welcome to our community! We require contributors to sign our Contributor License Agreement and we don't seem to have your signature on file. Check out this article for more information on why we have a CLA.

In order for us to review and merge your code, please submit the Individual Contributor License Agreement form attached above above. If you have questions about the CLA, or if you believe you've received this message in error, don't hesitate to ping @drewbanin.

CLA has not been signed by users: @Victoriapm

@github-actions
Copy link
Contributor

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the dbt-bigquery contributing guide.

@cla-bot cla-bot bot added the cla:yes label Dec 14, 2022

def test__constraints_enforcement(self, project):

results, log_output = run_dbt_and_capture(['run', '-s', 'my_model_error'], expect_pass=False)
Copy link
Contributor

@sungchun12 sungchun12 Dec 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Run the raw sql DDL manually against a bigquery instance to debug why it's running successfully: https://github.com/dbt-labs/dbt-bigquery/actions/runs/3744300377/jobs/6357515726#step:7:755

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DDL runs in BQ, also config +models run as expected according to the test in a dbt project where I've been testing this

with open("./models/my_model.sql", "w") as fp:
fp.write(my_model_sql_error)

results = run_dbt(["run", "-s", "my_model"], expect_pass=False)
Copy link
Contributor

@sungchun12 sungchun12 Dec 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Run the raw sql DDL manually against a bigquery instance to debug why it's running successfully: https://github.com/dbt-labs/dbt-bigquery/actions/runs/3744300377/jobs/6357515726#step:7:653

@sungchun12
Copy link
Contributor

This test is failing because the schema configs you import from the existing tests has text as a data_type and that isn't allowed by BigQuery. Can you figure out how to override that in your tests? https://github.com/dbt-labs/dbt-bigquery/actions/runs/3751704864/jobs/6373025699#step:7:511

{% for i in user_provided_columns -%}
{%- set col = user_provided_columns[i] -%}
{% set constraints = col['constraints'] -%}
{%- set ns.at_least_one_check = ns.at_least_one_check or col['check'] %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
{%- set ns.at_least_one_check = ns.at_least_one_check or col['check'] %}
{%- set ns.at_least_one_check = ns.at_least_one_check or col['constraints_check'] %}

Do a find and replace on all check configs in all files to be constraints_check by name.

@@ -52,8 +52,10 @@
{{ sql_header if sql_header is not none }}

create or replace table {{ relation }}
{{ get_columns_spec_ddl() }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
{{ get_columns_spec_ddl() }}
{{ get_columns_spec_ddl() }}
{{ get_assert_columns_equivalent(sql) }}

This creates a validation check that columns in a sql statement match the schema config

@@ -0,0 +1,18 @@
{% macro bigquery__get_columns_spec_ddl() %}
{# loop through user_provided_columns to create DDL with data types and constraints #}
{% if config.get('constraints_enabled', False) %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the if statement been added to adapters.sql, and can now be deleted from this macro.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

@Fleid
Copy link
Contributor

Fleid commented Jan 30, 2023

@Victoriapm @sungchun12 I'm supposing this is still being worked on.
Please ping me when/if you want it to be reviewed by the adapters team ;)

@sungchun12 sungchun12 marked this pull request as ready for review January 31, 2023 16:25
@sungchun12
Copy link
Contributor

@Victoriapm can you cross-check the other adapter PRs, and update this branch with additional changes as needed?

@dbeatty10 dbeatty10 changed the title Create bigquery adapter macro for constraints dbt Constraints / model contracts Feb 14, 2023
@jtcohen6 jtcohen6 merged commit f26fadd into main Feb 17, 2023
@jtcohen6 jtcohen6 deleted the dbt-constraints-bigquery branch February 17, 2023 11:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

[CT-1696] [Feature] Support constraints in BigQuery following the dbt Core implementation
5 participants