Support more expressions in equality join #4140
Comments
Maybe it is related to the |
Yes, it is related to The issue is not very detailed, so let me add a little more. Currently our We can extend That means:
For #2877, since the join unit is expression, we can wrap |
Sounds like a good plan. Any expression of the form |
cc @mingmwang |
I can take a look at the issue. The major reason is
The first time calculation happens when during the
Maybe we can also take a complex approach and check the expression's complexity
logical plan
physical plan
logical plan
physical plan
|
I also just added EquivalenceProperties/EquivalentClass to the physical execution plan. If we plan to support expressions as equal join conditions, I will need to enhance the EquivalenceProperties/EquivalentClass related logic as well. |
For what it is worth, the existing join logic contains hard-coded assumptions, in several places, that the join is between two columns. Changing the join logic (which is already complicated and will likely only get more so) is likely to be quite challenging. Therefore I agree with @mingmwang's proposal of:
I don't think there would be any performance difference between casting in a Projection and casting in the Join itself, and I think it would keep the Join significantly less complicated. I would suggest not handling casts in the Join but instead working on improving the other optimizer simplification rules to remove the casts. Like the expression
Can be rewritten into
Which we would still have to have a This type of unwrapping is already done in https://github.com/apache/arrow-datafusion/blob/master/datafusion/optimizer/src/unwrap_cast_in_comparison.rs |
Thank you @mingmwang @alamb. Agree with your suggestion to implement this with another projection. I will work on it. |
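The projection-based approach agreed on in this thread can be illustrated with a small, self-contained Python sketch. It is not DataFusion code: the function names, the `__left_key` / `__right_key` column names, and the sample join condition `t1.a + 1 = t2.b * 2` are all hypothetical and chosen only for illustration. The idea is that each side's join expression is computed once in a projection, so the join itself only ever compares two plain columns and can use a hash join instead of a cross join followed by a filter.

```python
from collections import defaultdict


def cross_join_filter(left, right, predicate):
    """Baseline: test the predicate on every pair of rows (len(left) * len(right) checks)."""
    return [{**l, **r} for l in left for r in right if predicate(l, r)]


def project(rows, name, expr):
    """Projection: append a computed column `name` to every row."""
    return [{**row, name: expr(row)} for row in rows]


def hash_join(left, right, left_col, right_col):
    """Equi-join on two plain columns: build a hash table on the left, probe with the right."""
    table = defaultdict(list)
    for l in left:
        if l[left_col] is not None:  # SQL-style: a NULL key never matches
            table[l[left_col]].append(l)
    return [
        {**l, **r}
        for r in right
        if r[right_col] is not None
        for l in table.get(r[right_col], [])
    ]


def join_on_expressions(left, right, left_expr, right_expr):
    """Rewrite `left_expr = right_expr` as projection + column equi-join + projection."""
    lk, rk = "__left_key", "__right_key"  # hypothetical temporary column names
    joined = hash_join(
        project(left, lk, left_expr),
        project(right, rk, right_expr),
        lk,
        rk,
    )
    # Final projection drops the temporary key columns again.
    return [{k: v for k, v in row.items() if k not in (lk, rk)} for row in joined]


if __name__ == "__main__":
    t1 = [{"a": 1}, {"a": 3}, {"a": 4}]
    t2 = [{"b": 1}, {"b": 2}, {"b": 5}]
    # Hypothetical condition: t1.a + 1 = t2.b * 2
    print(join_on_expressions(t1, t2, lambda r: r["a"] + 1, lambda r: r["b"] * 2))
```

Both strategies return the same rows for a pure equality condition; the difference is that the rewritten form evaluates each expression once per input row and leaves the join operator with a column-to-column comparison, which is the assumption the existing join logic already makes.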
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Currently some equality joins which contain normal expressions will run as cross join. For example:
We can move these to hash-join to improve performance.
Describe the solution you'd like
Move these equality joins from cross join to join in logical and physical plan.
In addition, it also helps to fix:
Describe alternatives you've considered
Additional context