Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Casing not preserved in seeds #178

Open
2 tasks done
gwenwindflower opened this issue Apr 18, 2024 · 3 comments
Open
2 tasks done

[Bug] Casing not preserved in seeds #178

gwenwindflower opened this issue Apr 18, 2024 · 3 comments
Labels
bug Something isn't working case_sensitivity Issues related to dbt's case-sensitivity behavior quoting Issues related to dbt's quoting behavior

Comments

@gwenwindflower
Copy link

gwenwindflower commented Apr 18, 2024

Is this a new bug?

  • I believe this is a new bug
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

dbt appears to successfully create new seeds from case sensitive file names e.g. mySeed but then errors with an 'approximate match' error later on because the seed created actually follows the default casing of the warehouse (lower on Redshift, UPPER on Snowflake, etc). Having a cased seed in your project essentially results in an unclear error if you're using an adapter that defaults to casing one way or the other.

Expected Behavior

I would expect one of two things:

  1. dbt throws an error or warning about cased seeds
  2. dbt preserves the casing of the seed

Steps To Reproduce

  1. On Redshift or Snowflake (any adapter that has a default lower or UPPER casing policy will do)
  2. Create a seed called 'mySeed.csv' with whatever data
  3. dbt seed.
  4. dbt seed again or dbt build, anything that would involve referencing that seed.

Relevant log output

/home/circleci/project
[0m14:35:31  [[33mWARNING[0m]: Deprecated functionality

User config should be moved from the 'config' key in profiles.yml to the 'flags' key in dbt_project.yml.
[0m14:35:31  Running with dbt=1.8.0-b2
[0m14:35:31  Installing ../
[0m14:35:31  Installed from <local @ ../>
[0m14:35:31  Installing dbt-labs/dbt_utils
[0m14:35:31  Installed from version 1.1.1
[0m14:35:31  Up to date!
[0m14:35:33  [[33mWARNING[0m]: Deprecated functionality

User config should be moved from the 'config' key in profiles.yml to the 'flags' key in dbt_project.yml.
[0m14:35:33  Running with dbt=1.8.0-b2
[0m14:35:33  Registered adapter: redshift=1.8.0-b2
[0m14:35:33  Unable to do partial parsing because config vars, config profile, or config target have changed
[0m14:35:33  Unable to do partial parsing because env vars used in profiles.yml have changed
[0m14:35:33  Unable to do partial parsing because a project dependency has been added
[0m14:35:36  Found 6 models, 23 data tests, 4 seeds, 2 sources, 617 macros
[0m14:35:58  [[33mWARNING[0m]: Deprecated functionality

User config should be moved from the 'config' key in profiles.yml to the 'flags' key in dbt_project.yml.
[0m14:35:58  Running with dbt=1.8.0-b2
[0m14:35:59  Registered adapter: redshift=1.8.0-b2
[0m14:35:59  Found 6 models, 23 data tests, 4 seeds, 2 sources, 617 macros
[0m14:35:59  
[0m14:36:01  Concurrency: 1 threads (target='redshift')
[0m14:36:01  
[0m14:36:01  1 of 4 START seed file codegen_integration_tests_redshift_Raw_Data_Case_Sensitive.data__Case_Sensitive  [RUN]
[0m14:36:01  1 of 4 ERROR loading seed file codegen_integration_tests_redshift_Raw_Data_Case_Sensitive.data__Case_Sensitive  [[31mERROR[0m in 0.03s]
[0m14:36:01  2 of 4 START seed file codegen_integration_tests_redshift_raw_data.data__a_relation  [RUN]
[0m14:36:02  2 of 4 OK loaded seed file codegen_integration_tests_redshift_raw_data.data__a_relation  [[32mCREATE 2[0m in 0.70s]
[0m14:36:02  3 of 4 START seed file codegen_integration_tests_redshift_raw_data.data__b_relation  [RUN]
[0m14:36:02  3 of 4 OK loaded seed file codegen_integration_tests_redshift_raw_data.data__b_relation  [[32mCREATE 2[0m in 0.60s]
[0m14:36:02  4 of 4 START seed file codegen_integration_tests_redshift_raw_data.data__campaign_analytics  [RUN]
[0m14:36:03  4 of 4 OK loaded seed file codegen_integration_tests_redshift_raw_data.data__campaign_analytics  [[32mCREATE 3[0m in 0.63s]
[0m14:36:03  
[0m14:36:03  Finished running 4 seeds in 0 hours 0 minutes and 3.81 seconds (3.81s).
[0m14:36:03  
[0m14:36:03  [31mCompleted with 1 error and 0 warnings:[0m
[0m14:36:03  
[0m14:36:03    Compilation Error in seed data__Case_Sensitive (seeds/data__Case_Sensitive.csv)
  When searching for a relation, dbt found an approximate match. Instead of guessing 
  which relation to use, dbt will move on. Please delete "ci"."codegen_integration_tests_redshift_raw_data_case_sensitive"."data__case_sensitive", or rename it to be less ambiguous.
  Searched for: "ci"."codegen_integration_tests_redshift_Raw_Data_Case_Sensitive"."data__Case_Sensitive"
  Found: "ci"."codegen_integration_tests_redshift_raw_data_case_sensitive"."data__case_sensitive"
  
  > in macro materialization_seed_default (macros/materializations/seeds/seed.sql)
  > called by seed data__Case_Sensitive (seeds/data__Case_Sensitive.csv)
[0m14:36:03  
[0m14:36:03  Done. PASS=3 WARN=0 ERROR=1 SKIP=0 TOTAL=4

Environment

- OS: Alpine in Docker 
- Python: 3.9.9
- dbt-adapter: redshift=1.8.0-b2

Additional Context

No response

@gwenwindflower gwenwindflower added bug Something isn't working triage labels Apr 18, 2024
@dataders dataders added the quoting Issues related to dbt's quoting behavior label Apr 19, 2024
@dataders
Copy link

after discussing with @gwenwindflower, I advised that she open the error here because the issue happens on both dbt-snowflake and dbt-redshift

the error is ApproximateMatchError raised by BaseAdapter.matches() (below), but the root cause might be due to the seed materialization and corresponding helper macros not properly quoting.

if approximate_match and not exact_match:
target = self.create(database=database, schema=schema, identifier=identifier)
raise ApproximateMatchError(target, self)

@dataders dataders removed the triage label Apr 19, 2024
@dbeatty10 dbeatty10 added the case_sensitivity Issues related to dbt's case-sensitivity behavior label Jul 11, 2024
@alison985
Copy link

Might be silly, maybe it's not implemented for all adapters, or maybe I'm missing something very obvious, but you can specify quoting seed columns in dbt code.

@dataders
Copy link

dataders commented Sep 10, 2024

i'd be curious to know if wrapping the below this.render() within adapter.quote()

{% set sql %}
insert into {{ this.render() }} ({{ cols_sql }}) values
{% for row in chunk -%}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working case_sensitivity Issues related to dbt's case-sensitivity behavior quoting Issues related to dbt's quoting behavior
Projects
None yet
Development

No branches or pull requests

4 participants