[FLINK-36092] Fix schema evolution with wildcarded transform rules #3557

Merged (5 commits) on Aug 22, 2024



@yuxiqian yuxiqian commented Aug 19, 2024

Currently, transform rules don't work well with schema evolution.

The main issue is that TransformOperator (both the Pre- and Post- variants) simply passes every upstream schema change event downstream, which is incorrect in most cases.

Here is how this PR tries to handle each schema change event for the various transform rule definitions:

| Event | `*, ...` | `..., *, ...` | `..., *` | No asterisk |
| --- | --- | --- | --- | --- |
| AddColumnEvent | Yes, but LAST position needs reordering (LAST -> AFTER) | Yes, but FIRST / LAST position needs reordering | Yes, but FIRST position needs reordering | No |
| AlterColumnTypeEvent | Yes | Yes | Yes | Maybe, if that column was referenced |
| RenameColumnEvent | Yes | Yes | Yes | No, exception if that column was referenced |
| DropColumnEvent | Yes | Yes | Yes | No, exception if that column was referenced |
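
The AddColumnEvent row above can be illustrated with a minimal sketch. It is not the PR's implementation: `AddColumnRemapSketch`, `Kind`, `Position`, and `remap` are hypothetical names, and mapping FIRST to a BEFORE position is an assumption (the description only spells out LAST -> AFTER). The idea is that a FIRST or LAST position stays valid only when no projected columns sit on that side of the wildcard.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch; none of these names come from Flink CDC. */
final class AddColumnRemapSketch {

    /** Position kinds an add-column request may carry. */
    enum Kind { FIRST, LAST, BEFORE, AFTER }

    /** A position plus the column it is relative to (null for FIRST / LAST). */
    record Position(Kind kind, String anchor) {
        static Position of(Kind kind) {
            return new Position(kind, null);
        }
    }

    /**
     * Decides what to forward downstream for an upstream add-column request.
     *
     * @param projection projection items in order, with "*" as the wildcard
     * @param upstreamColumns upstream column names before the new column is added
     * @param requested position requested by the upstream event
     * @return the position to forward, or empty if the event should not be forwarded
     */
    static Optional<Position> remap(
            List<String> projection, List<String> upstreamColumns, Position requested) {
        int star = projection.indexOf("*");
        if (star < 0) {
            // No asterisk: the new column is not part of the projected schema.
            return Optional.empty();
        }
        boolean columnsBeforeStar = star > 0;
        boolean columnsAfterStar = star < projection.size() - 1;
        if (upstreamColumns.isEmpty()) {
            // Nothing to anchor to, so keep the request unchanged.
            return Optional.of(requested);
        }
        if (requested.kind() == Kind.FIRST && columnsBeforeStar) {
            // Assumption: FIRST becomes BEFORE the first upstream column.
            return Optional.of(new Position(Kind.BEFORE, upstreamColumns.get(0)));
        }
        if (requested.kind() == Kind.LAST && columnsAfterStar) {
            // LAST -> AFTER the last upstream column, as noted in the table.
            String lastColumn = upstreamColumns.get(upstreamColumns.size() - 1);
            return Optional.of(new Position(Kind.AFTER, lastColumn));
        }
        // BEFORE / AFTER positions already name an upstream column and pass through.
        return Optional.of(requested);
    }
}
```

For example, with the projection `*, UPPER(name) AS upper_name` and upstream columns `id, name`, a LAST request is rewritten to AFTER `name`, so the computed column stays at the end of the downstream schema.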

@yuxiqian

Considering this is blocking users from using schema evolution with transform blocks, could @leonardBang @aiwenmo please take a look?


@leonardBang leonardBang left a comment

Thanks @yuxiqian for the fix. The change generally looks good to me, except for one comment.

@yuxiqian yuxiqian changed the title [FLINK-36092] Let schema evolve work with transform [FLINK-36092] Fix schema evolution with wildcarded transform rules Aug 20, 2024
# Conflicts:
#	flink-cdc-runtime/src/main/java/org/apache/flink/cdc/runtime/operators/transform/PreTransformOperator.java
This was caused by late initialization of the `transforms` blocks. `open` isn't early enough, since it isn't executed until after the `initializeState` phase. According to the Flink docs, initializing data fields in `setup` should be suitable.
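
The ordering problem described in this commit message can be sketched in plain Java. This is a simulation under stated assumptions, not Flink code: `MiniOperator`, `LateInit`, `EarlyInit`, and `run` are hypothetical stand-ins that only mirror the call order `setup`, then `initializeState`, then `open`, and the sketch assumes that state initialization needs to read the `transforms` field.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-ins; these are not Flink classes. */
final class OperatorLifecycleSketch {

    /** Mirrors only the call order: setup, then initializeState, then open. */
    abstract static class MiniOperator {
        List<String> transforms;
        boolean transformsReadyDuringInitializeState;

        void setup() {}

        void initializeState() {
            // Assumption: restoring state needs the transform rules.
            transformsReadyDuringInitializeState = transforms != null;
        }

        void open() {}
    }

    /** Initializes the field in open(), which runs after initializeState(). */
    static final class LateInit extends MiniOperator {
        @Override
        void open() {
            transforms = new ArrayList<>(List.of("*"));
        }
    }

    /** Initializes the field in setup(), which runs before initializeState(). */
    static final class EarlyInit extends MiniOperator {
        @Override
        void setup() {
            transforms = new ArrayList<>(List.of("*"));
        }
    }

    /** Drives an operator through the three phases in order. */
    static MiniOperator run(MiniOperator operator) {
        operator.setup();
        operator.initializeState();
        operator.open();
        return operator;
    }
}
```

Running both variants through `run` leaves `transformsReadyDuringInitializeState` false for `LateInit` and true for `EarlyInit`, which is the difference the commit relies on.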
@leonardBang

@yuxiqian The CI failed

@yuxiqian

It seems the Postgres and OceanBase CI jobs are hanging, which is unlikely to be related to this PR.

@leonardBang leonardBang left a comment

CI passed after retry, merging...

@leonardBang leonardBang merged commit 3837887 into apache:master Aug 22, 2024
21 checks passed
yuxiqian added a commit to yuxiqian/flink-cdc that referenced this pull request Aug 22, 2024
…ed transform rule

This closes apache#3557.

(cherry picked from commit 3837887)
leonardBang pushed a commit that referenced this pull request Aug 27, 2024
leonardBang pushed a commit to yuxiqian/flink-cdc that referenced this pull request Aug 27, 2024
…ed transform rule

This closes apache#3557.

(cherry picked from commit 3837887)
leonardBang pushed a commit that referenced this pull request Aug 27, 2024
…ed transform rule

This closes #3557.

(cherry picked from commit 3837887)
qiaozongmi pushed a commit to qiaozongmi/flink-cdc that referenced this pull request Sep 23, 2024