Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mutually_exclusive_ranges test breaks when using --store-failures on BigQuery #423

Closed
1 of 5 tasks
codigo-ergo-sum opened this issue Sep 23, 2021 · 2 comments
Closed
1 of 5 tasks
Labels
bug Something isn't working good first issue

Comments

@codigo-ergo-sum
Copy link
Contributor

Describe the bug

When running a test suite which includes the mutually_exclusive_ranges test and using the --store-failures option, dbt fails when trying to insert rows into the target database, specifically BigQuery. Here are some example error messages that we've received:

Database Error in test dbt_utils_mutually_exclusive_ranges_qb_payment_filter_not_allowed__dbt_valid_from__qb_payment_id_invoice_payment_id__dbt_valid_to (models/staging/filters/qb/qb_payment_filter.yml)
  CREATE TABLE columns must be named, but column 1 has no name at [9:6]

Database Error in test dbt_utils_mutually_exclusive_ranges_qb_project_balances_filter_not_allowed__dbt_valid_from__qb_project_id_qb_account_id__dbt_valid_to (models/staging/filters/qb/qb_project_balances_filter.yml)
  CREATE TABLE columns must be named, but column 1 has no name at [9:6]

Steps to reproduce

  1. Have a mutually_exclusive_ranges test
  2. Run dbt test --store-failures

Expected results

The failures would be logged into a table in BigQuery and there would be no "Database Error" error messages

Actual results

We get the errors above

Screenshots and log output

Database Error in test dbt_utils_mutually_exclusive_ranges_qb_payment_filter_not_allowed__dbt_valid_from__qb_payment_id_invoice_payment_id__dbt_valid_to (models/staging/filters/qb/qb_payment_filter.yml)
  CREATE TABLE columns must be named, but column 1 has no name at [9:6]

Database Error in test dbt_utils_mutually_exclusive_ranges_qb_project_balances_filter_not_allowed__dbt_valid_from__qb_project_id_qb_account_id__dbt_valid_to (models/staging/filters/qb/qb_project_balances_filter.yml)
  CREATE TABLE columns must be named, but column 1 has no name at [9:6]

System information

packages:

 - package: dbt-labs/dbt_utils
   version: 0.7.3

 - package: tnightengale/dbt_meta_testing
   version: 0.3.3

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

<installed version: 0.20.2
   latest version: 0.20.2

Up to date!

Plugins:
  - bigquery: 0.20.2
  - snowflake: 0.20.2
  - redshift: 0.20.2
  - postgres: 0.20.2

Additional context

Are you interested in contributing the fix?

I could try... :) I'd need to learn the build and testing toolchain but am interested in doing so

@codigo-ergo-sum codigo-ergo-sum added bug Something isn't working triage labels Sep 23, 2021
@codigo-ergo-sum
Copy link
Contributor Author

codigo-ergo-sum commented Sep 23, 2021

I think the issue is here, a snippet of the compiled SQL:

    select
        
        qb_project_id || '|' || qb_account_id,
        
        dbt_valid_from as lower_bound,
        dbt_valid_to as upper_bound,

        lead(dbt_valid_from) over (
            partition by qb_project_id || '|' || qb_account_id
            order by dbt_valid_from
        ) as next_lower_bound,

        row_number() over (
            partition by qb_project_id || '|' || qb_account_id
            order by dbt_valid_from desc
        ) = 1 as is_last_record

    from my_project.sandbox.qb_project_balances_filter

There needs to be an alias on that first line qb_project_id || '|' || qb_account_id,

Looks like this would be a fix in this file:

Line 38, specifically.

This might only happen in cases where the mutually exclusive test is using a composite/constructed "partition" from the data - this error probably wouldn't be triggered in a case where the "partition" field in the .yml configuration is a single column.

Here's the test config for us that causes the error:

      - dbt_utils.mutually_exclusive_ranges:
          lower_bound_column: dbt_valid_from
          upper_bound_column: dbt_valid_to
          partition_by: qb_project_id || '|' || qb_account_id
          gaps: not_allowed

@joellabes
Copy link
Contributor

Hey @codigo-ergo-sum - thanks for opening this!

I agree with your diagnosis - by concatenating two columns you wind up with an unaliased column, and BQ won't let you turn that into a table.

We should explicitly alias this. If you wanted to have a go at it yourself, you're more than welcome to - the integration tests all run on PR so you don't need to do any setup (and I don't think the instructions are in a good state right now, so I probably wouldn't 😬 ). You can just commit the changed macro and open a PR. I'm hoping we'll cut a version 0.8.0 of dbt-utils in the next month or so; certainly before dbt Core 1.0 comes out.

In the meantime, you may want to host your own copy of this test in your macros folder with the alias baked in.

I was hoping you'd be able to work around this by putting the alias into your YAML:

      - dbt_utils.mutually_exclusive_ranges:
          lower_bound_column: dbt_valid_from
          upper_bound_column: dbt_valid_to
          partition_by: qb_project_id || '|' || qb_account_id as partition_by
          gaps: not_allowed

This doesn't work because we’re being very clever here and generating a partition clause, which isn't allowed to contain an alias.

codigo-ergo-sum added a commit to codigo-ergo-sum/dbt-utils that referenced this issue Oct 20, 2021
…with store failures with a multi-column concatenated partition by clause. As per: dbt-labs#423
joellabes pushed a commit that referenced this issue Nov 9, 2021
#430)

* Fixing an issue where the mutually exclusive range test doesn't work with store failures with a multi-column concatenated partition by clause. As per: #423

* Bumping dbt version to take into account CircleCI already being upgrade to dbt 1.0b as per Joel Labes

* Further tweak to version number checks to try to fix CircleCI

* Update pull_request_template.md (#438)

* Update pull_request_template.md

* Update .github/pull_request_template.md

* Add 0.8.0 changelog block

* Add changelog for mutually excl ranges change
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue
Projects
None yet
Development

No branches or pull requests

2 participants