Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change to union_relations in 0.8.1 causes error about no columns being found #505

Closed
4 tasks
andrewpblee opened this issue Feb 23, 2022 · 14 comments
Closed
4 tasks
Labels
bug Something isn't working

Comments

@andrewpblee
Copy link

Describe the bug

Today our activities model, which uses the union_relations macro, to stitch together a few models, has broken within our SQLFluff Github Action, with the following error:

TMP: dbt compilation error on file 'models/enrichment/en__activities.sql', There were no columns found to union for relations int__activities__customer_order_created, int__activities__non_customer_order_created, int__activities__order_adjusted, int__activities__order_line_created, int__activities__order_line_refunded

This error message looks like it was introduced in 0.8.1, which appears to amend how include and exclude works, but our model does not use these fields, here it is:

WITH
activities AS (
    {{ dbt_utils.union_relations(
        relations=[
            ref('int__activities__customer_order_created'),
            ref('int__activities__non_customer_order_created'),
            ref('int__activities__order_adjusted'),
            ref('int__activities__order_line_created'),
            ref('int__activities__order_line_refunded')
        ]
    ) }}
)
....

Steps to reproduce

With the following:

dbt_core==1.0.*
dbt_snowflake==1.0.*
sqlfluff-templater-dbt==0.9.*

and packages:

packages:
  - package: calogica/dbt_expectations
    version: 0.5.1

Create a model using union_relations, and try to compile the code

Expected results

The model should compile without issue

Actual results

The compile fails, stating that there are no columns found in the relations

System information

The contents of your packages.yml file:

  • package: calogica/dbt_expectations
    version: 0.5.1

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • [ x ] snowflake
  • other (specify: ____________)

The output of dbt --version:

installed version: 1.0.3
   latest version: 1.0.3

Up to date!

Plugins:
  - snowflake: 1.0.0 - Up to date!

Additional context

Looks related to this column_superset.keys() list from https://github.com/dbt-labs/dbt-utils/blob/main/macros/sql/union.sql#L64

Happy to share more info and screen grabs of our model and the underlying models, and from our github action as well (it's basically running sqlfluff lint on changed files)

Are you interested in contributing the fix?

@andrewpblee andrewpblee added bug Something isn't working triage labels Feb 23, 2022
@joellabes
Copy link
Contributor

Hi @andrewpblee, sorry for your trouble here! This is curious as during testing everything worked correctly. I'm wondering whether the absence of one or more of the include/exclude params behaves differently.

Could you try one or both of these:

  • manually pass in all columns' names for one of your unioning models to see if that corrects it
  • Making a custom version of the utils macro inside of your project, which changes this line to be something like {% if (include | length > 0 or exclude | length > 0) and not column_superset.keys() %}

https://github.com/dbt-labs/dbt-utils/blob/main/macros/sql/union.sql#L64

I suspect that might restore the old behaviour - if it does then I'll ship another patch ASAP.

Writing this on mobile which is why I'm delegating the debugging - thanks in advance for your help!

@joellabes joellabes removed the triage label Feb 23, 2022
@andrewpblee
Copy link
Author

hey @joellabes thanks - and no worries - i'll try the custom macro version first and see how that reacts

@andrewpblee
Copy link
Author

It works! - I copied the macro into a new file, swapped out the line as you said and the github action passes with dbt utils at 0.8.1

@joellabes
Copy link
Contributor

OK! Thanks for doing the research. We'll get that properly released later today hopefully, at which point you can safely delete your version

@andrewpblee
Copy link
Author

Thanks @joellabes, I'm intrigued why this is only happening within our github actions (we use sqlfluff and pre-commit-dbt and both errored) and not when I compile locally.

@joellabes
Copy link
Contributor

Hmm! That is weird - we couldn't repro it here either. Can you share the columns in the models you're trying to union? Our only guess so far is something along the lines of "maybe the third model doesn't share any columns with the first one".

@thisiscab
Copy link

thisiscab commented Feb 25, 2022

Hello, happy to help contribute to get this issue resolved. We're also experiencing the exact same problem.

@joellabes
Copy link
Contributor

@thisiscab thank you! In the short term, it looks like the workaround above will see you right - would appreciate any background you can share.

Is yours also only happening with sqlfluff and other CI checks? Is there anything unexpected about the columns you're combining? Do you define a set of include/exclude values in any of your usages of the macro?

@thisiscab
Copy link

thisiscab commented Feb 25, 2022

Here is how we're using the the macro:

{%- set table_that_contains_locations = [
        ref('stg_chord__shopify_orders'),
        ref('stg_chord__shopify_fulfillments'),
    ]
-%}

with source as (

    select * from {{ source }}

),

all_location_ids_outside_main_table as (

    {{ dbt_utils.union_relations(
        relations=table_that_contains_locations,
        include=[
            chord_utils.target_casing('location_id'),
            chord_utils.target_casing('created_at'),
        ]
    ) }}

),

...

It's failing in our CI checks when running the command dbt compile.

We've hard-coded the package's version 0.8.0 which is the latest known working version, so we don't need to "do" a workaround per say!

@andrewpblee
Copy link
Author

hey @joellabes, so this is our activities one-big-table, that has everything a customer ever did with us, each activity must have a core fields in common (e.g customer_id and activity_type) , so there shouldn't be anything in one model that isn't in another. Happy to share the columns if you need though

@GClunies
Copy link

GClunies commented Mar 2, 2022

@joellabes I am seeing similar issues in our dbt project @ Surfline. All started when I ran dbt deps which bumped dbt_utils version from 0.8.0 to 0.8.1. Originally, dbt_utils was installed by calogica/dbt_date. We've hard coded the dependency like @thisiscab called out to

packages:
  - package: calogica/dbt_date  # This includes dbt-labs/dbt-utils
    version: [">=0.5.0", "<0.6.0"]

  - package: dbt-labs/dbt_utils  # This is included due to a bug in union_relations for v0.8.1
    version: 0.8.0

  - package: dbt-labs/audit_helper
    version: 0.5.0

  - package: dbt-labs/codegen
    version: 0.5.0

The SQL where this failed was

with combined as (

    -- noqa: disable=all
    {{ dbt_utils.union_relations(
        relations=[
            ref('base_wavetrak_itunes_subscription_event_schema_1_2'),
            ref('base_wavetrak_itunes_subscription_event_schema_1_2_stitch')
        ],
        exclude=[
            'apple_report_datetime',
            'id',
            '_dbt_source_relation',
            '_sdc_batched_at',
            '_sdc_extracted_at',
            '_sdc_received_at',
            '_sdc_sequence',
            '_sdc_table_version'
        ]
        )
    }}
    -- noqa: enable=all

),

As a sanity check that it wasn't something upstream for our extract-load, I queried the underlying tables and they do have columns to union.

@joellabes
Copy link
Contributor

@GClunies are you running into issues only in secondary CI contexts (e.g. SQLFluff and pre-commit-dbt) vs deployment, or is it impacting your main workflow as well?

This is a weird one; I would prefer to understand it before fixing it, to be sure that we're actually fixing it, but since the workaround above protects users who aren't even using include/exclude, I think we'll ship that now and keep trying to track down the underlying issue.

@GClunies
Copy link

GClunies commented Mar 2, 2022

@joellabes I ran into it while developing locally. Error surfaced when running dbt build -s +some_model --full-refresh

@joellabes
Copy link
Contributor

OK thanks! I've just released dbt_utils 0.8.2 which has a fix, so I'll close this out. Sorry for the hassle all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants