Support cluster_by on snowflake #728
Issue dbt-labs#634 add cluster_by support for snowflake data warehouses.
{%- set _clusterby = [_clusterby] -%}
{%- endif -%}
{%- for item in _clusterby -%}
"{{ item }}"
We're currently experiencing a ton of pain around quoting on Snowflake. I think our default approach going forward will be to not quote identifiers in dbt, but instead ask the user to explicitly quote fields like this if desired. Do you buy that?
Another benefit of not quoting: you made a great example of using cluster_by in the sample.dbt_project.yml file below:
clusterby: ["date_day", "ad_group_id % 10"]
If dbt adds quotes, then I think the expression ad_group_id % 10 will look like an identifier, and Snowflake will do the wrong thing.
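To make the quoting concern concrete, here is a minimal Python sketch of what the macro's loop would emit with and without quotes. The helper name `render_cluster_by` and its `quote` flag are hypothetical, used only for illustration; they are not part of dbt or of this PR.

```python
def render_cluster_by(cluster_by, quote=False):
    """Hypothetical helper mirroring the macro's loop; not dbt's actual API."""
    # Accept a single string the same way the macro wraps it in a list.
    if isinstance(cluster_by, str):
        cluster_by = [cluster_by]
    # Optionally wrap every item in double quotes, as the diff above does.
    items = ['"{}"'.format(item) if quote else item for item in cluster_by]
    return "cluster by ({})".format(", ".join(items))


config = ["date_day", "ad_group_id % 10"]
quoted = render_cluster_by(config, quote=True)
unquoted = render_cluster_by(config, quote=False)
print(quoted)    # cluster by ("date_day", "ad_group_id % 10")
print(unquoted)  # cluster by (date_day, ad_group_id % 10)
```

In the quoted form, Snowflake would read "ad_group_id % 10" as the name of a single column instead of evaluating it as an expression, which is the failure described above.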
Yes, I buy that! We are definitely in the "don't quote identifiers and let the database do its thing" camp. Allowing users to override that behavior on a case-by-case basis makes sense.
We have an edge case in our Vertica to Snowflake migration, where we manually created a handful of views whose quoted identifiers are lower-cased versions of the table names. This is so that our pre-existing Tableau reports, which quote a lower-cased version of the table name, continue to function without change.
@jon-rtr I played around with this a little bit, and I think we still need to address the fact that Snowflake requires the table DDL to be specified along with For example, this works:
but this does not:
Do you know if it's possible to use |
@jon-rtr we're closing out some stale PRs -- did you have a chance to give this any more thought? I'm definitely keen to add support for |
Issue #634 add cluster_by support for snowflake data warehouses.