Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quoting in Snowflake Merge Statements #3129

Closed
4 tasks
skinnms opened this issue Feb 24, 2021 · 4 comments
Closed
4 tasks

Quoting in Snowflake Merge Statements #3129

skinnms opened this issue Feb 24, 2021 · 4 comments
Labels
bug Something isn't working good_first_issue Straightforward + self-contained changes, good for new contributors! stale Issues that have gone stale

Comments

@skinnms
Copy link

skinnms commented Feb 24, 2021

Describe the bug

A clear and concise description of what the bug is. What command did you run? What happened?

When running incremental models with quoting turned on in the schema.yml file, if you use that column in your config block as the unique_key, it does not quote that column in the merge statement. This is causing our merge statements to fail.

Steps To Reproduce

In as much detail as possible, please provide steps to reproduce the issue. Sample data that triggers the issue, example model code, etc is all very helpful here.

Using a column that is all lowercase in snowflake as a unique_key for an incremental materialization with quoting turned on in the schema.yml file for that column the merge will fail with invalid identifier error.

Expected behavior

A clear and concise description of what you expected to happen.

When quoting is turned on for a column it is quoted in the merge statement that is compiled.

Screenshots and log output

If applicable, add screenshots or log output to help explain your problem.

bad SQL:
merge into IT_DB.INT_SCH.incident_merge as DBT_INTERNAL_DEST
using IT_DB.INT_SCH.incident_merge__dbt_tmp as DBT_INTERNAL_SOURCE
on
DBT_INTERNAL_SOURCE.number = DBT_INTERNAL_DEST.number

Good SQL:
merge into IT_DB.INT_SCH.incident_merge as DBT_INTERNAL_DEST
using IT_DB.INT_SCH.incident_merge__dbt_tmp as DBT_INTERNAL_SOURCE
on
DBT_INTERNAL_SOURCE."number" = DBT_INTERNAL_DEST."number"

System information

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • [ x] snowflake
  • other (specify: ____________)

The output of dbt --version:

0.18.1

The operating system you're using:
windows
The output of python --version:
3.8.9

Additional context

Add any other context about the problem here.

@skinnms skinnms added bug Something isn't working triage labels Feb 24, 2021
@jtcohen6 jtcohen6 removed the triage label Feb 25, 2021
@jtcohen6
Copy link
Contributor

Hey @skinnms, thanks for the issue!

I imagine your incremental model looks something like:

{{ config(
    materialized = 'incremental',
    unique_key = 'number'
) }}

select "number", another_column, ts
from table
{% if is_incremental() %}
where ts > (select max(ts) from {{ this }})
{% endif %}

It sounds like you also have resource properties defined like:

models:
  - name: incident_merge
    columns:
      - name: number
        description: Primary key
        quote: true
        tests:
          - unique
          - not_null

The quote: true column property ensures that, when dbt operates on that column (for tests, for persist_docs), it knows to quote the column's identifier in the database.

Today, there is no actual mapping between the number column defined in your .yml file, and the value of unique_key in the model config. That is, unique_key is not a property of the column; it's a model config representing the exact quoted/cased expression you want dbt to use in the materialization DDL/DML. So today, you can avoid your error by specifying the necessary quotes with one of:

{{ config(
    materialized = 'incremental',
    unique_key = '"number"'
) }}
{{ config(
    materialized = 'incremental',
    unique_key = adapter.quote('number')
) }}

Proposal

This is true for other model configs whose value is the name of a column (e.g. cluster_by, partition_by). What I hear you saying, and it's a fair request, is that dbt should:

  • Check to see if the value of a relevant model config matches (by name) a column defined in the model's .yml properties
  • If so, use the quote value set in that column's properties within the incremental materialization

Does that sound right to you?

Future

In a post-#2401 world, where we've reconciled node configs and resource properties as much as possible, we could also think about supporting a specification like:

models:
  - name: incident_merge
    unique_key: number
    columns:
      - name: number
        quote: true

Or even:

models:
  - name: incident_merge
    columns:
      - name: number
        unique_key: true
        quote: true

@skinnms
Copy link
Author

skinnms commented Feb 27, 2021

Hey @jtcohen6 ! Thanks for the quick feedback!

You are spot on. Our model looks very similar as well as the resource paths.

We did end up getting the model to run after trial/error with the below code (we define the directory as +materialized: incremental in our dbt_project.yml file), but it felt a bit hacky so I decided to post an issue. The adapter.quote() seems like a useful short-term fix, but I couldn't find much when Googling around for it. Maybe it could be added to the dbt docs?

{{ config( unique_key = '"number"' ) }}

Proposal
You are spot on again.

Future
I love both options you have provided. My first thought would be to use the second option so that it could be used to provide extra depth in some way in our docs/dag, but would be great to hear from others on the subject.

@jtcohen6 jtcohen6 added the good_first_issue Straightforward + self-contained changes, good for new contributors! label Mar 1, 2021
@jtcohen6
Copy link
Contributor

jtcohen6 commented Mar 1, 2021

Glad you were able to get this working in the meantime! I'm going to mark this a good first issue, since the near-term proposal should be fairly straightforward to implement. I'm also going to link #2986, since we know that the various quoting configs need some love.

It's not a high priority for us while there are functional workarounds, but I'd be happy to help anyone interested in contributing :)

@github-actions
Copy link
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

@github-actions github-actions bot added the stale Issues that have gone stale label Oct 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good_first_issue Straightforward + self-contained changes, good for new contributors! stale Issues that have gone stale
Projects
None yet
Development

No branches or pull requests

2 participants