Describe the bug
The get_tables_by_pattern_sql() macro does not properly support Databricks.
Databricks exposes table_catalog as a field in the information schema for tables, but the macro does not filter on the provided database in its where clause. If you use 'system' as the database, it instead returns tables from all catalogs that match the schema/table patterns.
Steps to reproduce
Run any macro that calls get_tables_by_pattern_sql() against Databricks with 'system' as the database and 'billing' as the schema. If another schema named billing exists in a different catalog, the macro pulls in tables from all catalogs.
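As a concrete repro sketch (the schema and table patterns here are illustrative), any wrapper that delegates to get_tables_by_pattern_sql(), such as dbt_utils.get_relations_by_pattern, will hit this:

```sql
-- in a model or run-operation; assumes a 'billing' schema also exists
-- in at least one catalog other than 'system'
{% set relations = dbt_utils.get_relations_by_pattern(
    schema_pattern='billing',
    table_pattern='%',
    database='system'
) %}

{% for rel in relations %}
    {{ log(rel, info=true) }}
{% endfor %}
```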
Expected results
Respect the filter against database and only return tables within a specific catalog.
Actual results
See steps to reproduce: the macro returns all tables across all catalogs whenever multiple schemas share the same name.
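For illustration, the default (non-Databricks) implementation compiles to roughly the following; nothing constrains table_catalog, and on Databricks the system catalog's information_schema.tables spans every catalog in the metastore:

```sql
select distinct
    table_schema as `table_schema`,
    table_name as `table_name`,
    ...  -- table type expression from get_table_types_sql()
from system.information_schema.tables
where table_schema ilike 'billing'
  and table_name ilike '%'
-- missing: and table_catalog ilike 'system'
```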
Screenshots and log output
System information
packages:
package: dbt-labs/codegen
version: 0.12.1
Which database are you using dbt with?
- [ ] postgres
- [ ] redshift
- [ ] bigquery
- [ ] snowflake
- [x] other (specify: Databricks)
The output of dbt --version:
Core:
- installed: 1.8.5
- latest: 1.8.5 - Up to date!
Plugins:
- databricks: 1.8.5 - Up to date!
- spark: 1.8.0 - Up to date!
Additional context
Are you interested in contributing the fix?
Sure. The fix is to add a Databricks-specific version of the macro to the get_tables_by_pattern_sql file that filters on table_catalog:
{% macro databricks__get_tables_by_pattern_sql(schema_pattern, table_pattern, exclude='', database=target.database) %}

    select distinct
        table_schema as {{ adapter.quote('table_schema') }},
        table_name as {{ adapter.quote('table_name') }},
        {{ dbt_utils.get_table_types_sql() }}
    from {{ database }}.information_schema.tables
    where table_catalog ilike '{{ database }}'
        and table_schema ilike '{{ schema_pattern }}'
        and table_name ilike '{{ table_pattern }}'
        and table_name not ilike '{{ exclude }}'

{% endmacro %}
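With this override in place, the same call with database='system' would compile with the catalog pinned, approximately:

```sql
from system.information_schema.tables
where table_catalog ilike 'system'   -- new predicate pins the catalog
  and table_schema ilike 'billing'
  and table_name ilike '%'
```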