
Make internal macros use macro dispatch pattern #72

Merged: ueshin merged 9 commits into databricks:main from macros on Apr 11, 2022

Conversation

ueshin (Collaborator) commented Apr 1, 2022

Description

Makes internal macros use the macro dispatch pattern, preserving backward compatibility for users who override the equivalent macros in their projects.

jtcohen6 (Contributor) commented Apr 3, 2022

In cases where these macros are line-for-line identical with their counterparts in dbt-spark, I don't think you actually need to redefine them here. You can still call them within other macros in this plugin, including in materializations. The reason: All macros defined in plugins and in dbt-core's global project are saved in the "internal"/"global" dbt namespace.

In cases where you do need to redefine these macros—e.g. file_format_clause should return delta by default rather than parquet—we should use explicit macro dispatch to deterministically define inheritance order. Docs: https://docs.getdbt.com/reference/dbt-jinja-functions/dispatch

To that end, I think the right solution is to add dispatch calls and spark__ prefixes in dbt-spark, so that you can then add databricks__ prefixes in dbt-databricks. To simplify, the preference order is as follows (a sketch of the resulting macros appears after the list):

  1. Macros in the user's own project — they can always override the builtins, and they can do it by defining a macro named any of these: file_format_clause, databricks__file_format_clause, spark__file_format_clause, default__file_format_clause
  2. Built-in macros prefixed databricks__: your version, databricks__file_format_clause, delta by default
  3. Built-in macros prefixed spark__: our version, spark__file_format_clause (should probably be parquet by default rather than nothing/text!)
  4. Finally, macros prefixed default__, i.e. if file_format_clause were to be a macro built into dbt-core with a sensible default, in the same manner as create_table_as and get_merge_sql
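For illustration, here is a minimal sketch of what that dispatch chain could look like (simplified and assumed, not the exact dbt-spark or dbt-databricks source):

{# In dbt-spark: the dispatching entry point plus the spark__ default #}
{% macro file_format_clause() %}
  {{ return(adapter.dispatch('file_format_clause', 'dbt')()) }}
{% endmacro %}

{% macro spark__file_format_clause() %}
  {# assumed simplification: default to parquet rather than nothing #}
  using {{ config.get('file_format', default='parquet') }}
{% endmacro %}

{# In dbt-databricks: override the default to delta #}
{% macro databricks__file_format_clause() %}
  using {{ config.get('file_format', default='delta') }}
{% endmacro %}

With that in place, a macro named file_format_clause (or databricks__file_format_clause, etc.) in the user's own project wins over both built-ins, per the preference order above.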

Could I ask you to open that as an issue/PR in dbt-spark? If the CLA is a blocker to contributing actual code, let's talk more and find a way to make that not so.

dev_requirements.txt: outdated review thread (resolved)
@ueshin changed the title from "Fix macro names to use the same names as dbt-spark." to "Make internal macros use macro dispatch pattern" on Apr 5, 2022
@ueshin marked this pull request as ready for review on April 6, 2022
allisonwang-db (Collaborator) left a comment:

Looks good!

@@ -1,19 +1,11 @@
-{% macro dbt_databricks_file_format_clause() %}
+{% macro databricks__file_format_clause() %}
A collaborator commented:

Just curious, if a user has re-defined the file_format_clause in an existing project, for example

{% macro file_format_clause() %}
...
{% endmacro %}

Will this change (and the upstream change in dbt-spark) break the existing pipeline when the user upgrades the adapter version?

ueshin (Collaborator, Author) replied:

According to @jtcohen6's explanation in #72 (comment), the user's file_format_clause will be used.

  1. Macros in the user's own project — they can always override the builtins, and they can do it by defining a macro named any of these: file_format_clause, databricks__file_format_clause, spark__file_format_clause, default__file_format_clause

-r{toxinidir}/dev_requirements.txt
-r{toxinidir}/requirements.txt
A collaborator commented:

This is needed because we need to install the latest dbt-spark version first?

ueshin (Collaborator, Author) replied:

Yes.
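
For context, a minimal sketch of what the corresponding tox.ini deps section might look like (assumed layout; the comments restate the reasoning above):

[testenv]
deps =
    # dev_requirements.txt is listed first so the latest dbt-spark
    # is installed before the pinned packages in requirements.txt
    -r{toxinidir}/dev_requirements.txt
    -r{toxinidir}/requirements.txt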

ueshin (Collaborator, Author) commented Apr 11, 2022

Thanks! merging.

@ueshin ueshin merged commit a760dd2 into databricks:main Apr 11, 2022
@ueshin ueshin deleted the macros branch April 11, 2022 18:11
ueshin added a commit that referenced this pull request Apr 11, 2022
### Description

Backport of #72.

Makes internal macros use the macro dispatch pattern, preserving backward compatibility for users who override the equivalent macros in their projects.
The old-name macros are still available during the 1.0.x releases, but will not be available in the 1.1.0 release.

This fix will be available with `dbt-spark>=1.0.1`.
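
For the transitional 1.0.x releases, one plausible shape for keeping the old names working is a thin wrapper that delegates to the new dispatching macro (a hypothetical sketch, not necessarily the actual implementation):

{# Hypothetical compatibility shim: the old-name macro delegates to the
   new dispatching entry point, so existing call sites keep working #}
{% macro dbt_databricks_file_format_clause() %}
  {{ return(file_format_clause()) }}
{% endmacro %}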