Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Data tests do not quote columns names when mixed case #10477

Closed
2 tasks done
urkle opened this issue Jul 23, 2024 · 7 comments
Closed
2 tasks done

[Bug] Data tests do not quote columns names when mixed case #10477

urkle opened this issue Jul 23, 2024 · 7 comments
Labels
bug Something isn't working dbt tests Issues related to built-in dbt testing functionality wontfix Not a bug or out of scope for dbt-core

Comments

@urkle
Copy link

urkle commented Jul 23, 2024

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

We have models that use mixed-case column names (and table names), however when defining column tests the generated SQL does not quote the column names thus the queries fail on postgres (and other DBs that are case sensitive)

select
    myColumn as unique_field,
    count(*) as n_records

from "my_db"."public"."myModel"
where myColumn is not null
group by myColumn
having count(*) > 1

Expected Behavior

DB Columns to be quoted so case is handled correctly.

select
    "myColumn" as unique_field,
    count(*) as n_records

from "my_db"."public"."myModel"
where "myColumn" is not null
group by "myColumn"
having count(*) > 1

Steps To Reproduce

Model definition yml

models:
  -name: myModel
    columns:
      - name: myColumn
         data_tests:
           - unique

Relevant log output

No response

Environment

- OS: macOS 12.5
- Python: python 3.11.9
- dbt: 1.8.4

Which database adapter are you using with dbt?

postgres

Additional Context

No response

@urkle urkle added bug Something isn't working triage labels Jul 23, 2024
@urkle
Copy link
Author

urkle commented Jul 23, 2024

I implemented a local fix by overriding the test macros to wrap column_name as adapter.quote(column_name)

@urkle
Copy link
Author

urkle commented Jul 23, 2024

final macro changes that work (though needs a new quote parameter added to the main test definition)

{% macro default__test_unique(model, column_name, quote = True) %}
{%- set column_quoted = adapter.quote(column_name) if quote else column_name -%}
select
    {{ column_quoted }} as unique_field,
    count(*) as n_records

from {{ model }}
where {{ column_quoted }} is not null
group by {{ column_quoted }}
having count(*) > 1

{% endmacro %}

This allows this to work

models:
  -name: myModel
    data_tests:
       - unique:
            quote: False
            column_name: |
               "myColumn" || ':' || "myOtherColumn"
    columns:
      - name: myColumn
         data_tests:
           - unique

@jeremyyeo
Copy link
Contributor

jeremyyeo commented Jul 23, 2024

Did you try the quote property on the column https://docs.getdbt.com/reference/resource-properties/quote?

@dbeatty10 dbeatty10 added the dbt tests Issues related to built-in dbt testing functionality label Jul 23, 2024
@dbeatty10
Copy link
Contributor

This properly quoted for me when I used the quote property that @jeremyyeo mentioned:

models/my_model.sql

select null as "Id"

models/_models.yml

models:
  - name: my_model
    columns:
      - name: Id
        quote: true
        data_tests:
          - not_null

Here's the SQL in target/compiled/my_project/models/_models.yml/not_null_my_model__Id_.sql

select "Id"
from "db"."my_schema"."my_model"
where "Id" is null

@urkle
Copy link
Author

urkle commented Jul 24, 2024

@jeremyyeo i hunted all through the docs and source and could only find quote_columns (for seeds!).

Is there a way to turn that quote setting on for ALL columns in a model? (or even all models in a project?)

And I'll try that right now to see if it works.

@urkle
Copy link
Author

urkle commented Jul 24, 2024

@jeremyyeo that does indeed work and gets the columns quoted. It would be nice to be able to set it for the whole project.

(e.g. a new key here? https://docs.getdbt.com/reference/project-configs/quoting ) and/or some per-model setting

@dbeatty10
Copy link
Contributor

@jeremyyeo that does indeed work and gets the columns quoted.

🎉

It would be nice to be able to set it for the whole project.
(e.g. a new key here? https://docs.getdbt.com/reference/project-configs/quoting ) and/or some per-model setting

Yeah! 🤩 We're imagining adding the following new config to dbt_project.yml (as described in #2986):

quote:
  database: true|false    # or `project` on dbt-bigquery
  schema: true|false      # or `dataset` on dbt-bigquery
  identifier: true|false
  columns: true|false

Since the quote property allows for quoting columns in data tests and #2986 proposes a way to apply similar quoting project-wide, I'm going to close this issue.

But please feel free to reach out if you have any other feedback or ideas.

@dbeatty10 dbeatty10 added wontfix Not a bug or out of scope for dbt-core and removed triage labels Jul 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working dbt tests Issues related to built-in dbt testing functionality wontfix Not a bug or out of scope for dbt-core
Projects
None yet
Development

No branches or pull requests

3 participants