Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initialize lift + shift for cross-db macros #594

Closed
wants to merge 3 commits into from

Conversation

jtcohen6
Copy link
Contributor

@jtcohen6 jtcohen6 commented May 18, 2022

This is a:

  • documentation update
  • bug fix with no breaking changes
  • new functionality
  • a breaking change
  • behind-the-scenes refactor

Description & motivation

See:

Bonus!

Install test cases for dateadd + datediff from dbt-tests-adapter, override just enough to verify backwards compatibility when the macros are called as dbt_utils.dateadd and dbt_utils.datediff.

How much backwards compatibility do we care about? There are three options:

  1. Zero backwards compatibility for dbt-core < 1.2. We cut a new minor version of dbt-utils with require-dbt-version: [">=1.2.0", "<2.0.0"]
  2. Some backwards compatibility. If someone is using dbt-core >=1.2, use the core macros; otherwise, keep the implementations around. (This requires duplicating code between dbt-core and dbt-utils.) Calling dbt_utils.dateadd + dbt_utils.datediff will keep working, but some custom dispatch rules won't keep working after upgrading to v1.2, since we're only dispatching on older versions.
  3. Total backwards compatibility, including support for custom dispatch rules. This requires duplicating some pretty gross version-checking logic into every single adapter-specific implementation. At this point... is there any sense actually doing this at all? Are we better off just totally duplicating the code?

My preference would be for option 1 or 2. If we go with option 2, we should pick a future version of dbt-core / dbt-utils where we remove backwards compatibility. (Maybe utils major version 1?)

Checklist

  • I have verified that these changes work locally on the following warehouses (Note: it's okay if you do not have access to all warehouses, this helps us understand what has been covered)
    • BigQuery
    • Postgres
    • Redshift
    • Snowflake
  • I followed guidelines to ensure that my changes will work on "non-core" adapters by:
    • dispatching any new macro(s) so non-core adapters can also use them (e.g. the star() source)
    • using the limit_zero() macro in place of the literal string: limit 0
    • using dbt_utils.type_* macros instead of explicit datatypes (e.g. dbt_utils.type_timestamp() instead of TIMESTAMP
  • I have updated the README.md (if applicable)
  • I have added tests & descriptions to my models (and macros if applicable)
  • I have added an entry to CHANGELOG.md

@dbeatty10
Copy link
Contributor

I'm attracted to:

  1. Zero backwards compatibility for dbt-core < 1.2. We cut a new minor version of dbt-utils with require-dbt-version: [">=1.2.0", "<2.0.0"]

My understanding is that the footprint left within dbt-utils will be minimal but still allow dbt_utils.date_add to be utilized by downstream consumers.

My main concern is this statement:

some custom dispatch rules won't keep working after upgrading to v1.2, since we're only dispatching on older versions.

What does it mean? Is there any situation in which downstream consumers might experience a breaking change when using the next minor version of dbt-utils + dbt-core >= 1.2?

@dbeatty10 dbeatty10 mentioned this pull request May 25, 2022
9 tasks
@jtcohen6
Copy link
Contributor Author

jtcohen6 commented May 30, 2022

I'm most attracted to the hard requirement of >=1.2 as well. It significantly limits the footprint of boilerplate/duplicated code in this repo. After all, our primary goal here is to make dbt_utils more maintainable, in its future (plural) forms!

Following this approach, there will be full backwards compatibility for folks who currently call the macro as dbt_utils.dateadd. There will not be full backwards compatibility for folks who do either of these instead:

  • Call dbt_utils.default__dateadd or dbt_utils.bigquery__dateadd directly, instead of dbt_utils.dateadd
  • Define their own bigquery__dateadd macro in a separate package, install both that package and dbt_utils in their root project, and then define a dispatch config in the root project that prefers the custom package's implementation when dispatching dbt_utils.dateadd

The latter is a pretty niche case IMO. It is the model that spark-utils has followed, e.g., though actual end users of the "shim" combo (dbt-utils + spark-utils) won't have to change a thing. If users have defined their own custom bigquery__dateadd macro within their actual root project, this will just work, because dispatch for internal macros always prefers candidates from the root project if available.

@jtcohen6
Copy link
Contributor Author

jtcohen6 commented Jun 3, 2022

Closing in favor of #597

@jtcohen6 jtcohen6 closed this Jun 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants