[CT-1842] [Feature] Support on_schema_change='sync_all_columns' for Delta tables #594

Open
3 tasks done
jeremyyeo opened this issue Jan 17, 2023 · 5 comments · May be fixed by #1088
Labels
enhancement New feature or request help_wanted Extra attention is needed

Comments

@jeremyyeo

jeremyyeo commented Jan 17, 2023

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt-spark functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Delta Lake 2.0 supports dropping columns from Delta tables when the table has certain tblproperties set (https://delta.io/blog/2022-08-29-delta-lake-drop-column/), so we may want to support the on_schema_change = 'sync_all_columns' config when columns have been removed from the source (compared to the target).
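
For reference, here is a minimal sketch of the SQL involved, written as Python helpers that only build strings. The property names and values follow the linked Delta Lake blog post (column mapping mode `name`, reader version 2, writer version 5). The helper names and the table name are hypothetical and are not part of dbt-spark.

```python
from typing import Sequence


def enable_column_mapping_sql(table: str) -> str:
    """Build the statement that enables column mapping by name.

    Property values are taken from the Delta Lake blog post on DROP COLUMN.
    """
    return (
        f"ALTER TABLE {table} SET TBLPROPERTIES ("
        "'delta.columnMapping.mode' = 'name', "
        "'delta.minReaderVersion' = '2', "
        "'delta.minWriterVersion' = '5')"
    )


def drop_columns_sql(table: str, columns: Sequence[str]) -> str:
    """Build an ALTER TABLE ... DROP COLUMNS statement (hypothetical helper)."""
    if not columns:
        raise ValueError("no columns to drop")
    # Backtick-quote each column name, as Spark SQL expects for identifiers.
    cols = ", ".join(f"`{c}`" for c in columns)
    return f"ALTER TABLE {table} DROP COLUMNS ({cols})"
```

Note that the first statement upgrades the table protocol, so it is probably something the user should opt into rather than something the adapter applies silently.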

Describe alternatives you've considered

A user could probably achieve this today by rewriting some of the relevant built-in macros.

Who will this benefit?

Currently, a schema change (specifically, a column removed from the source but still present in the target) results in an exception:

{{ exceptions.raise_compiler_error(platform_name + ' does not support dropping columns from tables') }}

Primarily, this will bring dbt-spark up to par with other adapters' behaviour for on_schema_change, without requiring users to implement their own overrides for the relevant helper macros (e.g. alter_relation_add_remove_columns() and family).
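
As a rough illustration of what 'sync_all_columns' has to work out, the sketch below diffs source and target column names. It is not dbt's implementation: `diff_columns` and the sample column names are hypothetical, the case-insensitive comparison is an assumption, and a real adapter would also compare data types.

```python
from typing import List, Sequence, Tuple


def diff_columns(
    source: Sequence[str], target: Sequence[str]
) -> Tuple[List[str], List[str]]:
    """Return (columns to add to the target, columns to remove from the target)."""
    # Assumption: names are compared case-insensitively.
    source_keys = {c.lower() for c in source}
    target_keys = {c.lower() for c in target}
    to_add = [c for c in source if c.lower() not in target_keys]
    to_remove = [c for c in target if c.lower() not in source_keys]
    return to_add, to_remove


# 'email' is new in the source; 'legacy_flag' only exists in the target.
to_add, to_remove = diff_columns(
    ["id", "name", "email"], ["id", "name", "legacy_flag"]
)
```

The non-empty "remove" list is the case that raises the exception today.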

Are you interested in contributing this feature?

Sure

Anything else?

We did not support this behaviour back in #229 because delta could not drop columns - this is now supported (albeit with the necessary tblproperties applied).

Probably the way to do this is to retrieve the tblproperties of the target and then decide whether to raise or not (or perhaps warn with a suggestion such as "in order to drop columns, please alter the tblproperties, ...").
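
A possible shape for that check, sketched in Python rather than as a macro. It assumes the properties arrive as key/value rows, as `SHOW TBLPROPERTIES` returns them, and that only column mapping mode `name` (the mode described in the blog post) qualifies. The function names and the hint wording are hypothetical.

```python
from typing import Dict, Iterable, Mapping, Tuple


def parse_tblproperties(rows: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Turn (key, value) rows, e.g. from SHOW TBLPROPERTIES, into a dict."""
    return {key: value for key, value in rows}


def drop_column_decision(props: Mapping[str, str]) -> Tuple[bool, str]:
    """Return (allowed, hint); the hint is empty when dropping is allowed."""
    # Assumption: an absent property means column mapping is disabled.
    mode = props.get("delta.columnMapping.mode", "none")
    if mode == "name":
        return True, ""
    # Hypothetical wording; the caller decides whether to raise or warn.
    hint = (
        "Dropping columns requires 'delta.columnMapping.mode' = 'name' "
        f"on the target table (found '{mode}')."
    )
    return False, hint
```

Returning a hint instead of raising directly leaves the raise-or-warn choice to the calling code.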

We probably want this in dbt-databricks too. The behaviour in the adapter is the same as it is here.

@jeremyyeo jeremyyeo added enhancement New feature or request triage labels Jan 17, 2023
@github-actions github-actions bot changed the title [Feature] Support on_schema_change='sync_all_columns' for Delta tables [CT-1842] [Feature] Support on_schema_change='sync_all_columns' for Delta tables Jan 17, 2023
@jtcohen6
Contributor

jtcohen6 commented Jan 18, 2023

Delta Lake 2.0 supports dropping of columns on delta tables if the table has certain tblproperties (https://delta.io/blog/2022-08-29-delta-lake-drop-column/)

Good find!

I was just converting these tests (#593), and remembering the hard way that Delta doesn't (by default) support removing columns.

I'll queue up for discussion with the relevant team. Depending on other commitments, we may want to mark this one as help_wanted.

@jtcohen6 jtcohen6 removed the triage label Jan 18, 2023
@nathaniel-may nathaniel-may added the help_wanted Extra attention is needed label Jan 19, 2023
@github-actions
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jul 19, 2023
@data-blade

still relevant for us

@github-actions github-actions bot removed the Stale label Jul 20, 2023
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jul 14, 2024
@data-blade

still relevant for us

@github-actions github-actions bot removed the Stale label Jul 18, 2024
@Jeremynadal33 Jeremynadal33 linked a pull request Aug 9, 2024 that will close this issue

4 participants