Spark 3.4: Push down system functions by V2 filters for rewriting DataFiles and PositionDeleteFiles #8560
What's the purpose here? It seems to just copy from L510-L513.
Is this meant to be a bucket transform call?
Yeah, good catch. Initially I was going to add a filter here for the bucket transform (and forgot to change it), but I ended up creating a new method to test that all V2 filters can be evaluated without exception.
Not related to this PR either. The `get` here is indeed not a good idea, because the expression could fail to translate and the resulting error message is not useful. I have opened #8394 for it.
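To illustrate the concern with calling `get` directly, here is a minimal sketch in plain Java. It assumes the translation returns an optional value; `translate` and `translateOrFail` are hypothetical names made up for this example and are not Iceberg or Spark APIs.

```java
import java.util.Optional;

public class TranslationExample {

  // Hypothetical stand-in for an expression-to-predicate translation that may fail.
  // An empty Optional models "could not translate".
  static Optional<String> translate(String expression) {
    if (expression == null || expression.trim().isEmpty()) {
      return Optional.empty();
    }
    return Optional.of("predicate(" + expression.trim() + ")");
  }

  // Instead of a bare get(), fail with a message that names the offending expression.
  static String translateOrFail(String expression) {
    return translate(expression)
        .orElseThrow(
            () ->
                new IllegalArgumentException(
                    "Cannot translate expression to a predicate: '" + expression + "'"));
  }
}
```

With a bare `get()`, a failed translation would surface as a generic `NoSuchElementException`; the variant above reports which expression could not be translated.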
Is there anything we have to worry about here in moving from SparkFilters to SparkV2Filters?
Thank you @ConeyLiu, I will rebase my change if #8394 gets merged first.
I also added some comments where we now convert Spark Catalyst expressions to Predicate instead of Spark source filters. I ran all the unit tests to make sure the old filters work as expected.
Is it possible to use the test utils class `SystemFunctionPushDownHelper` to build the table and data?
Thank you Coney, your test utils class is super helpful. However, I realized that the SparkAction tests assume a Hadoop catalog, so table creation is a bit different because it goes by table location: https://github.com/apache/iceberg/blob/master/spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java#L1563. But I opted to use your `SystemFunctionPushDownHelper` in TestRewriteDataFilesProcedure.