Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dbt build #3490

Merged
merged 6 commits into from
Jul 20, 2021
Merged

dbt build #3490

merged 6 commits into from
Jul 20, 2021

Conversation

iknox-fa
Copy link
Contributor

@iknox-fa iknox-fa commented Jun 22, 2021

resolves #2743

Description

Adds a dbt build command that does "all the things" for a given project. More specifically:

  • models invoke logic from dbt run
  • tests invoke logic from dbt test
  • seeds invoke logic from dbt seed
  • snapshots invoke logic from dbt snapshot

This is all handled in dependency order. Test failures block further downstream processing.
Test failure blocking was de-scoped due to testing issues.

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

@cla-bot cla-bot bot added the cla:yes label Jun 22, 2021
@iknox-fa iknox-fa changed the title dbt build, cli only dbt build Jun 22, 2021
@iknox-fa iknox-fa temporarily deployed to Postgres June 22, 2021 23:59 Inactive
@iknox-fa iknox-fa temporarily deployed to Bigquery June 22, 2021 23:59 Inactive
@iknox-fa iknox-fa temporarily deployed to Bigquery June 22, 2021 23:59 Inactive
@iknox-fa iknox-fa temporarily deployed to Snowflake June 22, 2021 23:59 Inactive
@iknox-fa iknox-fa temporarily deployed to Snowflake June 22, 2021 23:59 Inactive
@iknox-fa iknox-fa temporarily deployed to Redshift June 22, 2021 23:59 Inactive
@iknox-fa iknox-fa temporarily deployed to Redshift June 23, 2021 00:00 Inactive
Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work so far!

Test failures block further downstream processing.

Does this PR include the logic for skipping a downstream model, if a test on an upstream model fails? I didn't see that happen when I took for a spin locally.

core/dbt/graph/queue.py Outdated Show resolved Hide resolved
core/dbt/main.py Outdated Show resolved Hide resolved
Copy link
Contributor

@nathaniel-may nathaniel-may left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I like that it ended up being a nice and concise change. What I can't quite tell from the code is where the ordering happens, and what happens when a test fails. Maybe we can add some comments to that effect?

core/dbt/task/build.py Outdated Show resolved Hide resolved
@nathaniel-may
Copy link
Contributor

Do we have an integration test that hits all the types with the build command?

@iknox-fa iknox-fa temporarily deployed to Postgres July 1, 2021 15:36 Inactive
@iknox-fa iknox-fa temporarily deployed to Snowflake July 1, 2021 15:36 Inactive
@iknox-fa iknox-fa temporarily deployed to Snowflake July 1, 2021 15:36 Inactive
@iknox-fa iknox-fa temporarily deployed to Bigquery July 1, 2021 15:36 Inactive
@iknox-fa iknox-fa temporarily deployed to Bigquery July 1, 2021 15:36 Inactive
@iknox-fa iknox-fa temporarily deployed to Redshift July 1, 2021 15:36 Inactive
@iknox-fa iknox-fa temporarily deployed to Redshift July 1, 2021 15:36 Inactive
@iknox-fa iknox-fa temporarily deployed to Postgres July 5, 2021 17:40 Inactive
@iknox-fa iknox-fa temporarily deployed to Bigquery July 5, 2021 17:40 Inactive
@iknox-fa iknox-fa temporarily deployed to Bigquery July 5, 2021 17:40 Inactive
@iknox-fa iknox-fa temporarily deployed to Redshift July 5, 2021 17:40 Inactive
@iknox-fa iknox-fa temporarily deployed to Redshift July 5, 2021 17:40 Inactive
@iknox-fa iknox-fa temporarily deployed to Snowflake July 9, 2021 16:36 Inactive
@iknox-fa iknox-fa temporarily deployed to Redshift July 9, 2021 16:36 Inactive
@iknox-fa iknox-fa temporarily deployed to Redshift July 9, 2021 16:36 Inactive
Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

substance

  • Artifact output looks great! (I spent a few minutes visually inspecting run_results.json.) Given that tests can now be skipped (if their upstream model fails), for completeness' sake, we should probably adjust the TestStatus contract accordingly.
  • Ultimately, I'd like dbt build to use --select rather than --models. I'd actually like to make that switch across the board v0.21 (Remove the distinction between --select and --models flags when working with node types #3210). I don't want to block this PR any more than necessary, so as long as that feels like a simple thing to come later, I'm happy moving forward with it as is.

process

  • Do we want new issues for test-fail-block-downstream and RPC/CLI flag parity, or to keep using [Tracking] dbt build command #3452 for this?
  • Could we get a WIP draft PR for the test-blocking feature (since we got so close, and may just need to debug a few integration tests)?

CHANGELOG.md Outdated Show resolved Hide resolved
core/dbt/task/build.py Show resolved Hide resolved
Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I should've caught this above. I was just testing this a bit more, and I ran into an unexpected behavior. If I cause one model to fail (by introducing a syntax error), rather than continuing on and skipping downstream nodes, the error results in the end of the invocation—no more nodes run. This is the behavior I'd expect from --fail-fast, but not from a standard build invocation without supplying that flag.

Here's the log output of dbt build (without --fail-fast):

00:45:53 | 1 of 12 ERROR creating view model dbt_jcohen.stg_trello_boards....... [ERROR in 2.43s]
Encountered an error:
00:45:55 | 5 of 12 OK created view model dbt_jcohen.stg_trello_members.......... [SUCCESS 1 in 4.54s]
Runtime Error
  Database Error in model stg_trello_boards (models/staging/trello/stg_trello_boards.sql)
    001003 (42000): SQL compilation error:
    syntax error line 22 at position 4 unexpected 'a'.
    parse error line 27 at position 4 near '<EOF>'.
    syntax error line 26 at position 0 unexpected 'select'.
    compiled SQL at target/run/fishtown_internal_analytics/models/staging/trello/stg_trello_boards.sql
00:45:55 | 4 of 12 OK created view model dbt_jcohen.stg_trello_cards............ [SUCCESS 1 in 4.54s]
00:45:55 | 3 of 12 ERROR creating view model dbt_jcohen.stg_trello_card_members. [ERROR in 4.55s]
00:45:55 | 2 of 12 ERROR creating view model dbt_jcohen.stg_trello_card_labels.. [ERROR in 4.55s]

Here's dbt build --fail-fast:

00:46:13 | 1 of 12 ERROR creating view model dbt_jcohen.stg_trello_boards....... [ERROR in 2.49s]
00:46:15 | CANCEL query model.fishtown_internal_analytics.stg_trello_boards..... [CANCEL]
00:46:15 | CANCEL query model.fishtown_internal_analytics.stg_trello_card_labels [CANCEL]
00:46:15 | CANCEL query model.fishtown_internal_analytics.stg_trello_card_members [CANCEL]
00:46:15 | CANCEL query model.fishtown_internal_analytics.stg_trello_cards...... [CANCEL]
00:46:15 | CANCEL query model.fishtown_internal_analytics.stg_trello_members.... [CANCEL]
00:46:15 | 2 of 12 OK created view model dbt_jcohen.stg_trello_card_labels...... [SUCCESS 1 in 4.73s]
00:46:15 | 3 of 12 OK created view model dbt_jcohen.stg_trello_card_members..... [SUCCESS 1 in 4.73s]
00:46:16 | 5 of 12 ERROR creating view model dbt_jcohen.stg_trello_members...... [ERROR in 4.92s]
00:46:16 | 4 of 12 ERROR creating view model dbt_jcohen.stg_trello_cards........ [ERROR in 4.94s]

Database Error in model stg_trello_boards (models/staging/trello/stg_trello_boards.sql)
  001003 (42000): SQL compilation error:

@iknox-fa iknox-fa force-pushed the feature/dbt-build branch 2 times, most recently from ea71230 to 51b5175 Compare July 14, 2021 18:23
@iknox-fa iknox-fa temporarily deployed to Postgres July 14, 2021 18:29 Inactive
@iknox-fa iknox-fa temporarily deployed to Redshift July 14, 2021 18:29 Inactive
@iknox-fa iknox-fa temporarily deployed to Redshift July 14, 2021 18:29 Inactive
@iknox-fa iknox-fa temporarily deployed to Snowflake July 14, 2021 18:29 Inactive
@iknox-fa iknox-fa temporarily deployed to Snowflake July 14, 2021 18:29 Inactive
@iknox-fa iknox-fa temporarily deployed to Bigquery July 14, 2021 18:29 Inactive
@iknox-fa iknox-fa temporarily deployed to Bigquery July 14, 2021 18:29 Inactive
Copy link
Contributor

@kwigley kwigley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good!

core/dbt/graph/queue.py Show resolved Hide resolved
Copy link
Contributor

@nathaniel-may nathaniel-may left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐑

Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had a chance to catch up on the behavior above and discuss next steps. For now, I'm conditionally approving this PR, with the understanding that we'll be opening new issues to track the pieces that are missing:

  • [critical] On node failure: correctly write run_results.json (missing today); parity with dbt run for logging (showing skipped + run summary)
  • [highly desirable] Tests block other node children, test failures skip those children
  • [completeness] CLI flag parity with existing commands
  • [completeness] RPC build method

@iknox-fa Could you please link to those issues here after merging?

Excited to get the first rough cut of dbt build in for v0.21.0b1!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Generalized dbt build command
5 participants