Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

union_relations with exclude param on unmodified models not working with dbt docs generate in CI run #606

Closed
1 of 5 tasks
jeremyyeo opened this issue Jun 14, 2022 · 7 comments · Fixed by #607
Closed
1 of 5 tasks
Labels
bug Something isn't working triage

Comments

@jeremyyeo
Copy link
Contributor

jeremyyeo commented Jun 14, 2022

Describe the bug

Now... that's a long title 😅

Basically if you're using the exclude param in the union_relations macro in a model that isn't part of your CI run, dbt docs generate throws an error.

Steps to reproduce

Preamble

Make sure we already have a prod run (dbt build) and a CI run (dbt run --select state:modified+) setup in dbt Cloud with the generate docs option ticked.

  1. Project setup:
# dbt_project.yml
name: "my_dbt_project"
version: "1.0.0"
config-version: 2

...

models:
  +materialized: view
  my_dbt_project:
    staging:
      schema: stg
-- models/baz.sql
select 'baz' as user_id

-- models/staging/foo.sql
select 'foo' as user_id

-- models/staging/bar.sql
select 'bar' as user_id

-- models/publish/my_union_model.sql
with all_models as (
    {{
        dbt_utils.union_relations(
            relations = [ref('foo'), ref('bar')],
            exclude = ['_loaded_at']
        )
    }}
)

select * from all_models
  1. Do a normal dbt build prod run and see that there are no issues:

image

  1. Now make a modification to our baz model:
-- models/baz.sql
select 'bazbaz' as user_id
  1. Create a PR so that this will trigger a run of your CI job. Inspect the CI run:

image

^ Notice the Runtime Error.

Expected results

dbt docs generate to not error.

Actual results

dbt docs generate errors with a Runtime Error.

Screenshots and log output

2022-06-14 03:12:52.077257 (MainThread): 03:12:52  Running with dbt=1.0.7
2022-06-14 03:12:52.077841 (MainThread): 03:12:52  running dbt with arguments Namespace(cls=<class 'dbt.task.generate.GenerateTask'>, compile=True, debug=None, defer=None, event_buffer_size=None, exclude=None, fail_fast=None, log_cache_events=False, log_format=None, partial_parse=None, printer_width=None, profile='user', profiles_dir='/tmp/jobs/62529925/.dbt', project_dir=None, record_timing_info=None, rpc_method='docs.generate', select=None, selector_name=None, send_anonymous_usage_stats=None, single_threaded=False, state=PosixPath('/tmp'), static_parser=None, target='default', threads=None, use_colors=None, use_experimental_parser=None, vars='{}', version_check=None, warn_error=None, which='generate', write_json=None)
2022-06-14 03:12:52.078031 (MainThread): 03:12:52  Tracking: tracking
2022-06-14 03:12:52.078403 (MainThread): 03:12:52  Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'start', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f0163c996a0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f0163c99640>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f0163c996d0>]}
2022-06-14 03:12:52.127801 (MainThread): 03:12:52  Partial parsing enabled: 0 files deleted, 0 files added, 0 files changed.
2022-06-14 03:12:52.128082 (MainThread): 03:12:52  Partial parsing enabled, no changes found, skipping parsing
2022-06-14 03:12:52.134129 (MainThread): 03:12:52  Sending event: {'category': 'dbt', 'action': 'load_project', 'label': '9818ab21-c14d-4622-967b-42f24d78524c', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f01632d1940>]}
2022-06-14 03:12:52.142119 (MainThread): 03:12:52  Sending event: {'category': 'dbt', 'action': 'resource_counts', 'label': '9818ab21-c14d-4622-967b-42f24d78524c', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f016339b730>]}
2022-06-14 03:12:52.142471 (MainThread): 03:12:52  Found 4 models, 0 tests, 0 snapshots, 0 analyses, 392 macros, 0 operations, 0 seed files, 0 sources, 0 exposures, 0 metrics
2022-06-14 03:12:52.144037 (MainThread): 03:12:52  
2022-06-14 03:12:52.144503 (MainThread): 03:12:52  Acquiring new snowflake connection "master"
2022-06-14 03:12:52.145488 (ThreadPoolExecutor-1_0): 03:12:52  Acquiring new snowflake connection "list_development_dbt_cloud_pr_94908_27_stg"
2022-06-14 03:12:52.145996 (ThreadPoolExecutor-1_1): 03:12:52  Acquiring new snowflake connection "list_development_dbt_cloud_pr_94908_27"
2022-06-14 03:12:52.159790 (ThreadPoolExecutor-1_0): 03:12:52  Using snowflake connection "list_development_dbt_cloud_pr_94908_27_stg"
2022-06-14 03:12:52.161522 (ThreadPoolExecutor-1_1): 03:12:52  Using snowflake connection "list_development_dbt_cloud_pr_94908_27"
2022-06-14 03:12:52.161791 (ThreadPoolExecutor-1_0): 03:12:52  On list_development_dbt_cloud_pr_94908_27_stg: /* {"app": "dbt", "dbt_version": "1.0.7", "profile_name": "user", "target_name": "default", "connection_name": "list_development_dbt_cloud_pr_94908_27_stg"} */

    show terse objects in development.dbt_cloud_pr_94908_27_stg
2022-06-14 03:12:52.162074 (ThreadPoolExecutor-1_1): 03:12:52  On list_development_dbt_cloud_pr_94908_27: /* {"app": "dbt", "dbt_version": "1.0.7", "profile_name": "user", "target_name": "default", "connection_name": "list_development_dbt_cloud_pr_94908_27"} */

    show terse objects in development.dbt_cloud_pr_94908_27
2022-06-14 03:12:52.162339 (ThreadPoolExecutor-1_0): 03:12:52  Opening a new connection, currently in state init
2022-06-14 03:12:52.162570 (ThreadPoolExecutor-1_1): 03:12:52  Opening a new connection, currently in state init
2022-06-14 03:12:53.062295 (ThreadPoolExecutor-1_1): 03:12:53  SQL status: SUCCESS 1 in 0.9 seconds
2022-06-14 03:12:53.064691 (ThreadPoolExecutor-1_1): 03:12:53  On list_development_dbt_cloud_pr_94908_27: Close
2022-06-14 03:12:53.150354 (ThreadPoolExecutor-1_0): 03:12:53  Snowflake adapter: Snowflake query id: 01a4eee0-0401-9536-000d-3783087bd7ce
2022-06-14 03:12:53.151288 (ThreadPoolExecutor-1_0): 03:12:53  Snowflake adapter: Snowflake error: 002043 (02000): SQL compilation error:
Object does not exist, or operation cannot be performed.
2022-06-14 03:12:53.152529 (ThreadPoolExecutor-1_0): 03:12:53  Snowflake adapter: Error running SQL: macro list_relations_without_caching
2022-06-14 03:12:53.152859 (ThreadPoolExecutor-1_0): 03:12:53  Snowflake adapter: Rolling back transaction.
2022-06-14 03:12:53.153138 (ThreadPoolExecutor-1_0): 03:12:53  On list_development_dbt_cloud_pr_94908_27_stg: Close
2022-06-14 03:12:53.366179 (MainThread): 03:12:53  Concurrency: 4 threads (target='default')
2022-06-14 03:12:53.366588 (MainThread): 03:12:53  
2022-06-14 03:12:53.368676 (Thread-27): 03:12:53  Began running node model.my_dbt_project.bar
2022-06-14 03:12:53.368987 (Thread-28): 03:12:53  Began running node model.my_dbt_project.baz
2022-06-14 03:12:53.369316 (Thread-29): 03:12:53  Began running node model.my_dbt_project.foo
2022-06-14 03:12:53.369781 (Thread-27): 03:12:53  Acquiring new snowflake connection "model.my_dbt_project.bar"
2022-06-14 03:12:53.370242 (Thread-28): 03:12:53  Acquiring new snowflake connection "model.my_dbt_project.baz"
2022-06-14 03:12:53.370738 (Thread-29): 03:12:53  Acquiring new snowflake connection "model.my_dbt_project.foo"
2022-06-14 03:12:53.371087 (Thread-27): 03:12:53  Began compiling node model.my_dbt_project.bar
2022-06-14 03:12:53.371357 (Thread-28): 03:12:53  Began compiling node model.my_dbt_project.baz
2022-06-14 03:12:53.371648 (Thread-29): 03:12:53  Began compiling node model.my_dbt_project.foo
2022-06-14 03:12:53.371937 (Thread-27): 03:12:53  Compiling model.my_dbt_project.bar
2022-06-14 03:12:53.372181 (Thread-28): 03:12:53  Compiling model.my_dbt_project.baz
2022-06-14 03:12:53.372417 (Thread-29): 03:12:53  Compiling model.my_dbt_project.foo
2022-06-14 03:12:53.375507 (Thread-28): 03:12:53  Writing injected SQL for node "model.my_dbt_project.baz"
2022-06-14 03:12:53.377113 (Thread-29): 03:12:53  Writing injected SQL for node "model.my_dbt_project.foo"
2022-06-14 03:12:53.377861 (Thread-28): 03:12:53  finished collecting timing info
2022-06-14 03:12:53.379434 (Thread-27): 03:12:53  Writing injected SQL for node "model.my_dbt_project.bar"
2022-06-14 03:12:53.379778 (Thread-28): 03:12:53  Began executing node model.my_dbt_project.baz
2022-06-14 03:12:53.379967 (Thread-29): 03:12:53  finished collecting timing info
2022-06-14 03:12:53.380405 (Thread-28): 03:12:53  finished collecting timing info
2022-06-14 03:12:53.380734 (Thread-29): 03:12:53  Began executing node model.my_dbt_project.foo
2022-06-14 03:12:53.381007 (Thread-27): 03:12:53  finished collecting timing info
2022-06-14 03:12:53.381551 (Thread-28): 03:12:53  Finished running node model.my_dbt_project.baz
2022-06-14 03:12:53.381810 (Thread-29): 03:12:53  finished collecting timing info
2022-06-14 03:12:53.382070 (Thread-27): 03:12:53  Began executing node model.my_dbt_project.bar
2022-06-14 03:12:53.382673 (Thread-29): 03:12:53  Finished running node model.my_dbt_project.foo
2022-06-14 03:12:53.382938 (Thread-27): 03:12:53  finished collecting timing info
2022-06-14 03:12:53.383476 (Thread-27): 03:12:53  Finished running node model.my_dbt_project.bar
2022-06-14 03:12:53.384162 (Thread-30): 03:12:53  Began running node model.my_dbt_project.my_union_model
2022-06-14 03:12:53.384563 (Thread-30): 03:12:53  Acquiring new snowflake connection "model.my_dbt_project.my_union_model"
2022-06-14 03:12:53.384754 (Thread-30): 03:12:53  Began compiling node model.my_dbt_project.my_union_model
2022-06-14 03:12:53.384916 (Thread-30): 03:12:53  Compiling model.my_dbt_project.my_union_model
2022-06-14 03:12:53.419689 (Thread-30): 03:12:53  Using snowflake connection "model.my_dbt_project.my_union_model"
2022-06-14 03:12:53.419929 (Thread-30): 03:12:53  On model.my_dbt_project.my_union_model: /* {"app": "dbt", "dbt_version": "1.0.7", "profile_name": "user", "target_name": "default", "node_id": "model.my_dbt_project.my_union_model"} */

    describe table development.dbt_cloud_pr_94908_27_stg.foo
2022-06-14 03:12:53.420119 (Thread-30): 03:12:53  Opening a new connection, currently in state init
2022-06-14 03:12:54.121929 (Thread-30): 03:12:54  Snowflake adapter: Snowflake query id: 01a4eee0-0401-9536-000d-3783087bd7d2
2022-06-14 03:12:54.122208 (Thread-30): 03:12:54  Snowflake adapter: Snowflake error: 002003 (02000): SQL compilation error:
Schema 'DEVELOPMENT.DBT_CLOUD_PR_94908_27_STG' does not exist or not authorized.
2022-06-14 03:12:54.122469 (Thread-30): 03:12:54  Snowflake adapter: Error running SQL: macro get_columns_in_relation
2022-06-14 03:12:54.122629 (Thread-30): 03:12:54  Snowflake adapter: Rolling back transaction.
2022-06-14 03:12:54.270366 (Thread-30): 03:12:54  Using snowflake connection "model.my_dbt_project.my_union_model"
2022-06-14 03:12:54.270614 (Thread-30): 03:12:54  On model.my_dbt_project.my_union_model: /* {"app": "dbt", "dbt_version": "1.0.7", "profile_name": "user", "target_name": "default", "node_id": "model.my_dbt_project.my_union_model"} */

    describe table development.dbt_cloud_pr_94908_27_stg.bar
2022-06-14 03:12:54.468793 (Thread-30): 03:12:54  Snowflake adapter: Snowflake query id: 01a4eee0-0401-9536-000d-3783087bd7d6
2022-06-14 03:12:54.469039 (Thread-30): 03:12:54  Snowflake adapter: Snowflake error: 002003 (02000): SQL compilation error:
Schema 'DEVELOPMENT.DBT_CLOUD_PR_94908_27_STG' does not exist or not authorized.
2022-06-14 03:12:54.469304 (Thread-30): 03:12:54  Snowflake adapter: Error running SQL: macro get_columns_in_relation
2022-06-14 03:12:54.469461 (Thread-30): 03:12:54  Snowflake adapter: Rolling back transaction.
2022-06-14 03:12:54.471096 (Thread-30): 03:12:54  finished collecting timing info
2022-06-14 03:12:54.471367 (Thread-30): 03:12:54  On model.my_dbt_project.my_union_model: Close
2022-06-14 03:12:54.699652 (Thread-30): 03:12:54  Compilation Error in model my_union_model (models/publish/my_union_model.sql)
  There were no columns found to union for relations foo, bar
  
  > in macro default__union_relations (macros/sql/union.sql)
  > called by macro union_relations (macros/sql/union.sql)
  > called by model my_union_model (models/publish/my_union_model.sql)
  > called by model my_union_model (models/publish/my_union_model.sql)
2022-06-14 03:12:54.700064 (Thread-30): 03:12:54  Finished running node model.my_dbt_project.my_union_model
2022-06-14 03:12:54.700708 (MainThread): 03:12:54  Connection 'master' was properly closed.
2022-06-14 03:12:54.701128 (MainThread): 03:12:54  Connection 'model.my_dbt_project.baz' was properly closed.
2022-06-14 03:12:54.701291 (MainThread): 03:12:54  Connection 'model.my_dbt_project.bar' was properly closed.
2022-06-14 03:12:54.701450 (MainThread): 03:12:54  Connection 'model.my_dbt_project.foo' was properly closed.
2022-06-14 03:12:54.701595 (MainThread): 03:12:54  Connection 'model.my_dbt_project.my_union_model' was properly closed.
2022-06-14 03:12:54.701885 (MainThread): 03:12:54  Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f01680da640>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f01680daca0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f01680dab80>]}
2022-06-14 03:12:54.702217 (MainThread): 03:12:54  Flushing usage events
2022-06-14 03:12:54.725730 (MainThread): Runtime Error
  Compilation Error in model my_union_model (models/publish/my_union_model.sql)
    There were no columns found to union for relations foo, bar
    
    > in macro default__union_relations (macros/sql/union.sql)
    > called by macro union_relations (macros/sql/union.sql)
    > called by model my_union_model (models/publish/my_union_model.sql)
    > called by model my_union_model (models/publish/my_union_model.sql)

System information

The contents of your packages.yml file:

packages:
  - package: dbt-labs/dbt_utils
    version: 0.8.4

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

1.0.7 in dbt Cloud

Additional context

Initially I thought that the log lines:

Schema 'DEVELOPMENT.DBT_CLOUD_PR_94908_27_STG' does not exist or not authorized.

contributed to this - but if you actually do not use the exclude param - e.g.

-- models/publish/my_union_model.sql
with all_models as (
    {{
        dbt_utils.union_relations(
            relations = [ref('foo'), ref('bar')]
        )
    }}
)

select * from all_models

You still see the same lines logged BUT there is no error emitted. This is an example log of just that happening:

2022-06-14 02:45:36.874225 (MainThread): 02:45:36  Running with dbt=1.0.7
2022-06-14 02:45:36.874823 (MainThread): 02:45:36  running dbt with arguments Namespace(cls=<class 'dbt.task.generate.GenerateTask'>, compile=True, debug=None, defer=None, event_buffer_size=None, exclude=None, fail_fast=None, log_cache_events=False, log_format=None, partial_parse=None, printer_width=None, profile='user', profiles_dir='/tmp/jobs/62525334/.dbt', project_dir=None, record_timing_info=None, rpc_method='docs.generate', select=None, selector_name=None, send_anonymous_usage_stats=None, single_threaded=False, state=PosixPath('/tmp'), static_parser=None, target='default', threads=None, use_colors=None, use_experimental_parser=None, vars='{}', version_check=None, warn_error=None, which='generate', write_json=None)
2022-06-14 02:45:36.875014 (MainThread): 02:45:36  Tracking: tracking
2022-06-14 02:45:36.875413 (MainThread): 02:45:36  Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'start', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f08bcbb6ee0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f08bcbb6e80>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f08bcbb6f10>]}
2022-06-14 02:45:36.925386 (MainThread): 02:45:36  Partial parsing enabled: 0 files deleted, 0 files added, 0 files changed.
2022-06-14 02:45:36.925707 (MainThread): 02:45:36  Partial parsing enabled, no changes found, skipping parsing
2022-06-14 02:45:36.932264 (MainThread): 02:45:36  Sending event: {'category': 'dbt', 'action': 'load_project', 'label': 'cba5840a-d584-4e43-a087-524062821736', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f08bc955550>]}
2022-06-14 02:45:36.940814 (MainThread): 02:45:36  Sending event: {'category': 'dbt', 'action': 'resource_counts', 'label': 'cba5840a-d584-4e43-a087-524062821736', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f08bca1b250>]}
2022-06-14 02:45:36.941162 (MainThread): 02:45:36  Found 4 models, 1 test, 0 snapshots, 0 analyses, 392 macros, 0 operations, 0 seed files, 0 sources, 0 exposures, 0 metrics
2022-06-14 02:45:36.942671 (MainThread): 02:45:36  
2022-06-14 02:45:36.943174 (MainThread): 02:45:36  Acquiring new snowflake connection "master"
2022-06-14 02:45:36.944301 (ThreadPoolExecutor-1_0): 02:45:36  Acquiring new snowflake connection "list_development_dbt_cloud_pr_94908_25_stg"
2022-06-14 02:45:36.944819 (ThreadPoolExecutor-1_1): 02:45:36  Acquiring new snowflake connection "list_development_dbt_cloud_pr_94908_25"
2022-06-14 02:45:36.958614 (ThreadPoolExecutor-1_0): 02:45:36  Using snowflake connection "list_development_dbt_cloud_pr_94908_25_stg"
2022-06-14 02:45:36.960409 (ThreadPoolExecutor-1_1): 02:45:36  Using snowflake connection "list_development_dbt_cloud_pr_94908_25"
2022-06-14 02:45:36.960684 (ThreadPoolExecutor-1_0): 02:45:36  On list_development_dbt_cloud_pr_94908_25_stg: /* {"app": "dbt", "dbt_version": "1.0.7", "profile_name": "user", "target_name": "default", "connection_name": "list_development_dbt_cloud_pr_94908_25_stg"} */

    show terse objects in development.dbt_cloud_pr_94908_25_stg
2022-06-14 02:45:36.960974 (ThreadPoolExecutor-1_1): 02:45:36  On list_development_dbt_cloud_pr_94908_25: /* {"app": "dbt", "dbt_version": "1.0.7", "profile_name": "user", "target_name": "default", "connection_name": "list_development_dbt_cloud_pr_94908_25"} */

    show terse objects in development.dbt_cloud_pr_94908_25
2022-06-14 02:45:36.961216 (ThreadPoolExecutor-1_0): 02:45:36  Opening a new connection, currently in state init
2022-06-14 02:45:36.961439 (ThreadPoolExecutor-1_1): 02:45:36  Opening a new connection, currently in state init
2022-06-14 02:45:37.890424 (ThreadPoolExecutor-1_1): 02:45:37  SQL status: SUCCESS 1 in 0.93 seconds
2022-06-14 02:45:37.893008 (ThreadPoolExecutor-1_1): 02:45:37  On list_development_dbt_cloud_pr_94908_25: Close
2022-06-14 02:45:37.938899 (ThreadPoolExecutor-1_0): 02:45:37  Snowflake adapter: Snowflake query id: 01a4eec5-0401-9535-000d-3783087bcbaa
2022-06-14 02:45:37.939162 (ThreadPoolExecutor-1_0): 02:45:37  Snowflake adapter: Snowflake error: 002043 (02000): SQL compilation error:
Object does not exist, or operation cannot be performed.
2022-06-14 02:45:37.939506 (ThreadPoolExecutor-1_0): 02:45:37  Snowflake adapter: Error running SQL: macro list_relations_without_caching
2022-06-14 02:45:37.939682 (ThreadPoolExecutor-1_0): 02:45:37  Snowflake adapter: Rolling back transaction.
2022-06-14 02:45:37.939944 (ThreadPoolExecutor-1_0): 02:45:37  On list_development_dbt_cloud_pr_94908_25_stg: Close
2022-06-14 02:45:38.120193 (MainThread): 02:45:38  Concurrency: 4 threads (target='default')
2022-06-14 02:45:38.120627 (MainThread): 02:45:38  
2022-06-14 02:45:38.122767 (Thread-27): 02:45:38  Began running node model.my_dbt_project.bar
2022-06-14 02:45:38.123085 (Thread-28): 02:45:38  Began running node model.my_dbt_project.baz
2022-06-14 02:45:38.123444 (Thread-29): 02:45:38  Began running node model.my_dbt_project.foo
2022-06-14 02:45:38.123891 (Thread-27): 02:45:38  Acquiring new snowflake connection "model.my_dbt_project.bar"
2022-06-14 02:45:38.124357 (Thread-28): 02:45:38  Acquiring new snowflake connection "model.my_dbt_project.baz"
2022-06-14 02:45:38.124871 (Thread-29): 02:45:38  Acquiring new snowflake connection "model.my_dbt_project.foo"
2022-06-14 02:45:38.125189 (Thread-27): 02:45:38  Began compiling node model.my_dbt_project.bar
2022-06-14 02:45:38.125469 (Thread-28): 02:45:38  Began compiling node model.my_dbt_project.baz
2022-06-14 02:45:38.125729 (Thread-29): 02:45:38  Began compiling node model.my_dbt_project.foo
2022-06-14 02:45:38.126065 (Thread-27): 02:45:38  Compiling model.my_dbt_project.bar
2022-06-14 02:45:38.126342 (Thread-28): 02:45:38  Compiling model.my_dbt_project.baz
2022-06-14 02:45:38.126634 (Thread-29): 02:45:38  Compiling model.my_dbt_project.foo
2022-06-14 02:45:38.129773 (Thread-28): 02:45:38  Writing injected SQL for node "model.my_dbt_project.baz"
2022-06-14 02:45:38.131350 (Thread-27): 02:45:38  Writing injected SQL for node "model.my_dbt_project.bar"
2022-06-14 02:45:38.132938 (Thread-29): 02:45:38  Writing injected SQL for node "model.my_dbt_project.foo"
2022-06-14 02:45:38.133704 (Thread-28): 02:45:38  finished collecting timing info
2022-06-14 02:45:38.134165 (Thread-28): 02:45:38  Began executing node model.my_dbt_project.baz
2022-06-14 02:45:38.134368 (Thread-27): 02:45:38  finished collecting timing info
2022-06-14 02:45:38.134653 (Thread-29): 02:45:38  finished collecting timing info
2022-06-14 02:45:38.134899 (Thread-28): 02:45:38  finished collecting timing info
2022-06-14 02:45:38.135228 (Thread-27): 02:45:38  Began executing node model.my_dbt_project.bar
2022-06-14 02:45:38.135517 (Thread-29): 02:45:38  Began executing node model.my_dbt_project.foo
2022-06-14 02:45:38.136041 (Thread-28): 02:45:38  Finished running node model.my_dbt_project.baz
2022-06-14 02:45:38.136379 (Thread-27): 02:45:38  finished collecting timing info
2022-06-14 02:45:38.136633 (Thread-29): 02:45:38  finished collecting timing info
2022-06-14 02:45:38.137252 (Thread-27): 02:45:38  Finished running node model.my_dbt_project.bar
2022-06-14 02:45:38.137702 (Thread-29): 02:45:38  Finished running node model.my_dbt_project.foo
2022-06-14 02:45:38.138507 (Thread-30): 02:45:38  Began running node model.my_dbt_project.my_union_model
2022-06-14 02:45:38.138920 (Thread-30): 02:45:38  Acquiring new snowflake connection "model.my_dbt_project.my_union_model"
2022-06-14 02:45:38.139110 (Thread-30): 02:45:38  Began compiling node model.my_dbt_project.my_union_model
2022-06-14 02:45:38.139269 (Thread-30): 02:45:38  Compiling model.my_dbt_project.my_union_model
2022-06-14 02:45:38.175072 (Thread-30): 02:45:38  Using snowflake connection "model.my_dbt_project.my_union_model"
2022-06-14 02:45:38.175355 (Thread-30): 02:45:38  On model.my_dbt_project.my_union_model: /* {"app": "dbt", "dbt_version": "1.0.7", "profile_name": "user", "target_name": "default", "node_id": "model.my_dbt_project.my_union_model"} */

    describe table development.dbt_cloud_pr_94908_25_stg.foo
2022-06-14 02:45:38.175549 (Thread-30): 02:45:38  Opening a new connection, currently in state init
2022-06-14 02:45:39.040401 (Thread-30): 02:45:39  Snowflake adapter: Snowflake query id: 01a4eec5-0401-9535-000d-3783087bcbae
2022-06-14 02:45:39.040800 (Thread-30): 02:45:39  Snowflake adapter: Snowflake error: 002003 (02000): SQL compilation error:
Schema 'DEVELOPMENT.DBT_CLOUD_PR_94908_25_STG' does not exist or not authorized.
2022-06-14 02:45:39.041228 (Thread-30): 02:45:39  Snowflake adapter: Error running SQL: macro get_columns_in_relation
2022-06-14 02:45:39.041484 (Thread-30): 02:45:39  Snowflake adapter: Rolling back transaction.
2022-06-14 02:45:39.045393 (Thread-30): 02:45:39  Using snowflake connection "model.my_dbt_project.my_union_model"
2022-06-14 02:45:39.045697 (Thread-30): 02:45:39  On model.my_dbt_project.my_union_model: /* {"app": "dbt", "dbt_version": "1.0.7", "profile_name": "user", "target_name": "default", "node_id": "model.my_dbt_project.my_union_model"} */

    describe table development.dbt_cloud_pr_94908_25_stg.bar
2022-06-14 02:45:39.216198 (Thread-30): 02:45:39  Snowflake adapter: Snowflake query id: 01a4eec5-0401-9536-000d-3783087bd55e
2022-06-14 02:45:39.216471 (Thread-30): 02:45:39  Snowflake adapter: Snowflake error: 002003 (02000): SQL compilation error:
Schema 'DEVELOPMENT.DBT_CLOUD_PR_94908_25_STG' does not exist or not authorized.
2022-06-14 02:45:39.216738 (Thread-30): 02:45:39  Snowflake adapter: Error running SQL: macro get_columns_in_relation
2022-06-14 02:45:39.216899 (Thread-30): 02:45:39  Snowflake adapter: Rolling back transaction.
2022-06-14 02:45:39.222834 (Thread-30): 02:45:39  Writing injected SQL for node "model.my_dbt_project.my_union_model"
2022-06-14 02:45:39.223352 (Thread-30): 02:45:39  finished collecting timing info
2022-06-14 02:45:39.223591 (Thread-30): 02:45:39  Began executing node model.my_dbt_project.my_union_model
2022-06-14 02:45:39.223773 (Thread-30): 02:45:39  finished collecting timing info
2022-06-14 02:45:39.223974 (Thread-30): 02:45:39  On model.my_dbt_project.my_union_model: Close
2022-06-14 02:45:39.411156 (Thread-30): 02:45:39  Finished running node model.my_dbt_project.my_union_model
2022-06-14 02:45:39.412098 (Thread-27): 02:45:39  Began running node test.my_dbt_project.not_null_my_union_model_user_id.4b7cc2cd04
2022-06-14 02:45:39.412569 (Thread-27): 02:45:39  Acquiring new snowflake connection "test.my_dbt_project.not_null_my_union_model_user_id.4b7cc2cd04"
2022-06-14 02:45:39.412788 (Thread-27): 02:45:39  Began compiling node test.my_dbt_project.not_null_my_union_model_user_id.4b7cc2cd04
2022-06-14 02:45:39.412965 (Thread-27): 02:45:39  Compiling test.my_dbt_project.not_null_my_union_model_user_id.4b7cc2cd04
2022-06-14 02:45:39.424047 (Thread-27): 02:45:39  Writing injected SQL for node "test.my_dbt_project.not_null_my_union_model_user_id.4b7cc2cd04"
2022-06-14 02:45:39.424495 (Thread-27): 02:45:39  finished collecting timing info
2022-06-14 02:45:39.424724 (Thread-27): 02:45:39  Began executing node test.my_dbt_project.not_null_my_union_model_user_id.4b7cc2cd04
2022-06-14 02:45:39.424908 (Thread-27): 02:45:39  finished collecting timing info
2022-06-14 02:45:39.425263 (Thread-27): 02:45:39  Finished running node test.my_dbt_project.not_null_my_union_model_user_id.4b7cc2cd04
2022-06-14 02:45:39.426264 (MainThread): 02:45:39  Connection 'master' was properly closed.
2022-06-14 02:45:39.426478 (MainThread): 02:45:39  Connection 'test.my_dbt_project.not_null_my_union_model_user_id.4b7cc2cd04' was properly closed.
2022-06-14 02:45:39.426634 (MainThread): 02:45:39  Connection 'model.my_dbt_project.baz' was properly closed.
2022-06-14 02:45:39.426778 (MainThread): 02:45:39  Connection 'model.my_dbt_project.foo' was properly closed.
2022-06-14 02:45:39.426919 (MainThread): 02:45:39  Connection 'model.my_dbt_project.my_union_model' was properly closed.
2022-06-14 02:45:39.435369 (MainThread): 02:45:39  Done.
2022-06-14 02:45:39.437154 (MainThread): 02:45:39  Acquiring new snowflake connection "generate_catalog"
2022-06-14 02:45:39.437366 (MainThread): 02:45:39  Building catalog
2022-06-14 02:45:39.438495 (ThreadPoolExecutor-2_0): 02:45:39  Acquiring new snowflake connection "development.information_schema"
2022-06-14 02:45:39.445779 (ThreadPoolExecutor-2_0): 02:45:39  Using snowflake connection "development.information_schema"
2022-06-14 02:45:39.445988 (ThreadPoolExecutor-2_0): 02:45:39  On development.information_schema: /* {"app": "dbt", "dbt_version": "1.0.7", "profile_name": "user", "target_name": "default", "connection_name": "development.information_schema"} */

    
      with tables as (

          select
              table_catalog as "table_database",
              table_schema as "table_schema",
              table_name as "table_name",
              table_type as "table_type",
              comment as "table_comment",

              -- note: this is the _role_ that owns the table
              table_owner as "table_owner",

              'Clustering Key' as "stats:clustering_key:label",
              clustering_key as "stats:clustering_key:value",
              'The key used to cluster this table' as "stats:clustering_key:description",
              (clustering_key is not null) as "stats:clustering_key:include",

              'Row Count' as "stats:row_count:label",
              row_count as "stats:row_count:value",
              'An approximate count of rows in this table' as "stats:row_count:description",
              (row_count is not null) as "stats:row_count:include",

              'Approximate Size' as "stats:bytes:label",
              bytes as "stats:bytes:value",
              'Approximate size of the table as reported by Snowflake' as "stats:bytes:description",
              (bytes is not null) as "stats:bytes:include",

              'Last Modified' as "stats:last_modified:label",
              to_varchar(convert_timezone('UTC', last_altered), 'yyyy-mm-dd HH24:MI'||'UTC') as "stats:last_modified:value",
              'The timestamp for last update/change' as "stats:last_modified:description",
              (last_altered is not null and table_type='BASE TABLE') as "stats:last_modified:include"

          from development.INFORMATION_SCHEMA.tables

      ),

      columns as (

          select
              table_catalog as "table_database",
              table_schema as "table_schema",
              table_name as "table_name",

              column_name as "column_name",
              ordinal_position as "column_index",
              data_type as "column_type",
              comment as "column_comment"

          from development.INFORMATION_SCHEMA.columns
      )

      select *
      from tables
      join columns using ("table_database", "table_schema", "table_name")
      where (upper("table_schema") = upper('dbt_cloud_pr_94908_25_stg') or upper("table_schema") = upper('dbt_cloud_pr_94908_25'))
      order by "column_index"
2022-06-14 02:45:39.446181 (ThreadPoolExecutor-2_0): 02:45:39  Opening a new connection, currently in state init
2022-06-14 02:45:41.782646 (ThreadPoolExecutor-2_0): 02:45:41  SQL status: SUCCESS 1 in 2.34 seconds
2022-06-14 02:45:41.789754 (ThreadPoolExecutor-2_0): 02:45:41  On development.information_schema: Close
2022-06-14 02:45:41.987080 (MainThread): 02:45:41  Catalog written to /tmp/jobs/62525334/target/dbt/snowflake/target/catalog.json
2022-06-14 02:45:41.987581 (MainThread): 02:45:41  Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f091a2cab20>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f091a2cabe0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f091a2cad00>]}
2022-06-14 02:45:41.987917 (MainThread): 02:45:41  Flushing usage events
2022-06-14 02:45:42.008211 (MainThread): 02:45:42  Connection 'generate_catalog' was properly closed.
2022-06-14 02:45:42.008493 (MainThread): 02:45:42  Connection 'development.information_schema' was properly closed.

Are you interested in contributing the fix?

Sure - haven't fully digested why yet though.

@jeremyyeo jeremyyeo added bug Something isn't working triage labels Jun 14, 2022
@jeremyyeo
Copy link
Contributor Author

jeremyyeo commented Jun 14, 2022

{% if (include | length > 0 or exclude | length > 0) and not column_superset.keys() %}
{%- set relations_string -%}
{%- for relation in relations -%}
{{ relation.name }}
{%- if not loop.last %}, {% endif -%}
{%- endfor -%}
{%- endset -%}
{%- set error_message -%}
There were no columns found to union for relations {{ relations_string }}
{%- endset -%}
{{ exceptions.raise_compiler_error(error_message) }}
{%- endif -%}

Looks like we raise the error_message here when exclude is not of length 0 and there are no columns in the superset (there are no columns in the superset because the relations weren't created in the PR schema as they were not in the state:modified selection). If we don't use exclude / include than we don't run into this path.

Something like this could do the trick... is this right or not... 🤔

        {%- if flags.WHICH != 'generate' -%}
            {{ exceptions.raise_compiler_error(error_message) }}
        {%- endif -%}
 {%- endif -%} 

@erika-e
Copy link

erika-e commented Jun 14, 2022

🙌 Thanks for the thorough writeup, I'm the person who originally reported this bug to support! Following the issue for updates, but it seems like if we drop the excludes we can get docs back on CI runs?

@dbeatty10
Copy link
Contributor

Agreed, incredible investigation and write-up @jeremyyeo 🏅

Following the examples you provided, I was also able to reproduce this outside of dbt Cloud and CI.

The short story

  • the 2nd CI run doesn't have access to the unchanged tables/views -- those live in a schema only accessible to the 1st CI run
  • so when dbt_utils.union_relations is executed during dbt docs generate, there are no columns found and it raises the error

Successful solution

Your solution based on flags.WHICH != 'generate' also worked for me. 🙌

Side note: looks like flags.WHICH is planned to be renamed in the future.

Potential implementation

This particular macro has been the subject of a handful of issues and PRs recently, so we'll want to be thoughtful about the final implementation to avoid an accidental regression. At the same time, it would be nice to not just layer on more patch work in this section of code.

Recent history of issues and pulls related to this issue: (#266, #473) and (#505, #509).

Seemingly unrelated open issues for union_relations: #173, #202, #331, #395, and (#578, #587).

Merely doing something like the following might have solved #266:

select
    cast({{ dbt_utils.string_literal(relation) }} as {{ dbt_utils.type_string() }}) as {{ source_column_name }}{% if ordered_column_names %},{% endif -%}

If we want to continue raising errors in certain circumstances, I suspect we could do something like the following:

    {#-- If only compiling or generating documentation, it's okay if no columns are found -#}
    {%- set dbt_command = flags.WHICH -%}
    {% if dbt_command in ['run', 'build'] and not column_superset.keys() %}
        {%- set relations_string -%}
            {%- for relation in relations -%}
                {{ relation.name }}
            {%- if not loop.last %}, {% endif -%}
            {%- endfor -%}
        {%- endset -%}

        {%- set error_message -%}
            There were no columns found to union for relations {{ relations_string }}
        {%- endset -%}

        {{ exceptions.raise_compiler_error(error_message) }}
    {%- endif -%}

Ultimately, we might want to do both.

Main concern

I'm not aware of any tests that cover the original errors, so it will be harder to be confident that we're avoiding any regressions. I think (hope?) we can reason our way through this though.

@dbeatty10
Copy link
Contributor

This (presumed) fix went out in version 0.8.6 -- hopefully this fixes things for you @erika-e. Please let me know if not.

@barberscott
Copy link

barberscott commented Jun 16, 2022

Good choice on not using != 'generate' in your eventual solution since in rpc mode it's flagged as rpc which is....unexpected.

@dbeatty10
Copy link
Contributor

Good call-out about rpc, @barberscott ! Didn't realize this -- thanks for educating me about flags.WHICH == "rpc" when in rpc mode.

Do you know if there's any way to differentiate run vs. test, generate, etc when in rpc mode?
Do you know if this behavior will change when the next-generation dbt Server is released?

@jeremyyeo
Copy link
Contributor Author

Forgot about the rpc flags but copying a convo between Scott and I the last time we looked at this:

Macro like this:

{% macro log_me() %}
    {% if flags.WHICH == 'rpc' %}
        {% do run_query("insert into development.dbt_jyeo.my_test values('test')") %}
    {% endif %}
{% endmacro %}

In a hook:

on-run-start: 
  - "{{ log_me() }}"

Tested in IDE:

dbt run
dbt test
dbt run-operation
dbt compile
dbt docs generate

Only dbt docs generate inserted a row into my_test table.... meaning flags.WHICH == 'rpc' appear to be only true when the command is dbt docs generate.

The reset should be what their command actually is. Totes weird.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working triage
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants