Support split BroadcastNestedLoopJoin condition for AST and non-AST [databricks] #9702

Merged: 4 commits merged on Nov 16, 2023

Conversation

winningsix
Collaborator

This PR fixes #8832 and #9681. It is based on #9635 and additionally fixes issues found on the Databricks runtime.
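
As a rough illustration of what "splitting" a join condition can mean, the sketch below separates the top-level AND conjuncts of a condition into those an AST evaluator could handle and those it could not. It is a toy model and not the plugin's implementation: the `Col` and `Call` classes, the `AST_SUPPORTED` operator set, and `split_condition` are all hypothetical names, and treating the non-AST part as a separate filter is only valid for inner-join semantics.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Col:
    """A column reference (hypothetical toy expression node)."""
    name: str


@dataclass(frozen=True)
class Call:
    """An operator or function applied to argument expressions."""
    op: str
    args: Tuple["Expr", ...]


Expr = Union[Col, Call]

# Assumption: operators the AST evaluator supports. Purely illustrative.
AST_SUPPORTED = {"and", "<", ">", "=", "+"}


def ast_able(e: Expr) -> bool:
    """True if every operator in the expression tree is AST-supported."""
    if isinstance(e, Col):
        return True
    return e.op in AST_SUPPORTED and all(ast_able(a) for a in e.args)


def conjuncts(e: Expr) -> List[Expr]:
    """Flatten nested ANDs into a list of conjuncts."""
    if isinstance(e, Call) and e.op == "and":
        out: List[Expr] = []
        for a in e.args:
            out.extend(conjuncts(a))
        return out
    return [e]


def split_condition(cond: Expr) -> Tuple[List[Expr], List[Expr]]:
    """Return (AST-able conjuncts, remaining conjuncts)."""
    ast_part: List[Expr] = []
    rest: List[Expr] = []
    for c in conjuncts(cond):
        (ast_part if ast_able(c) else rest).append(c)
    return ast_part, rest


# l.a < r.b AND my_udf(l.c) = r.d, where my_udf is not AST-supported
lt = Call("<", (Col("l.a"), Col("r.b")))
udf_eq = Call("=", (Call("my_udf", (Col("l.c"),)), Col("r.d")))
ast_part, rest = split_condition(Call("and", (lt, udf_eq)))
```

Here `ast_part` holds the `l.a < r.b` comparison and `rest` holds the conjunct that contains the unsupported `my_udf` call.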

@winningsix
Collaborator Author

build

@revans2 changed the title from "Support split non-AST-able join condition for BroadcastNestedLoopJoin [databricks]" to "Support split BroadcastNestedLoopJoin condition for AST and non-AST [databricks]" on Nov 15, 2023
revans2 previously approved these changes Nov 15, 2023
@revans2
Collaborator

revans2 commented Nov 15, 2023

@jlowe could you also take a look at the fix for databricks?

@winningsix
Collaborator Author

build

@winningsix
Collaborator Author

winningsix commented Nov 16, 2023

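The build seems to have failed because of a test case unrelated to this PR.

For the HYPOT function, is it acceptable to have a small difference from the CPU result? The GPU value differs from the CPU value in the last printed digit, e.g. by 0.000 000 000 000 001e-123 for one row. @NVnavkumar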

Error log
[2023-11-16T08:36:34.333Z] ----------------------------- Captured stdout call -----------------------------
[2023-11-16T08:36:34.333Z] ### CPU RUN ###
[2023-11-16T08:36:34.333Z] ### GPU RUN ###
[2023-11-16T08:36:34.333Z] ### COLLECT: GPU TOOK 0.15743589401245117 CPU TOOK 0.15597891807556152 ###
[2023-11-16T08:36:34.333Z] --- CPU OUTPUT
[2023-11-16T08:36:34.333Z] +++ GPU OUTPUT
[2023-11-16T08:36:34.333Z] @@ -162,7 +162,7 @@
[2023-11-16T08:36:34.333Z]  Row(HYPOT(a, b)=4.7649288220271566e+38)
[2023-11-16T08:36:34.333Z]  Row(HYPOT(a, b)=5.716482950764417e+60)
[2023-11-16T08:36:34.333Z]  Row(HYPOT(a, b)=9.16493876188188e+119)
[2023-11-16T08:36:34.333Z] -Row(HYPOT(a, b)=1.012833120432452e-227)
[2023-11-16T08:36:34.333Z] +Row(HYPOT(a, b)=1.0128331204324521e-227)
[2023-11-16T08:36:34.333Z]  Row(HYPOT(a, b)=1.0125195708346668e-07)
[2023-11-16T08:36:34.333Z]  Row(HYPOT(a, b)=0.011335000267386385)
[2023-11-16T08:36:34.333Z]  Row(HYPOT(a, b)=inf)
[2023-11-16T08:36:34.333Z] @@ -458,7 +458,7 @@
[2023-11-16T08:36:34.333Z]  Row(HYPOT(a, b)=inf)
[2023-11-16T08:36:34.333Z]  Row(HYPOT(a, b)=4.189982105230443e+197)
[2023-11-16T08:36:34.333Z]  Row(HYPOT(a, b)=None)
[2023-11-16T08:36:34.333Z] -Row(HYPOT(a, b)=3.422031462089623e-123)
[2023-11-16T08:36:34.333Z] +Row(HYPOT(a, b)=3.422031462089622e-123)
[2023-11-16T08:36:34.333Z]  Row(HYPOT(a, b)=3.4211840517918124e+282)
[2023-11-16T08:36:34.333Z]  Row(HYPOT(a, b)=1.253731930287053e+29)
[2023-11-16T08:36:34.333Z]  Row(HYPOT(a, b)=6.442446824252015e+255)
[2023-11-16T08:36:34.333Z] @@ -513,7 +513,7 @@
(more to show)
[2023-11-16T08:36:34.335Z] =========================== short test summary info ============================
[2023-11-16T08:36:34.335Z] FAILED ../../src/main/python/arithmetic_ops_test.py::test_hypot[Double][DATAGEN_SEED=1700118086, INJECT_OOM, APPROXIMATE_FLOAT] - AssertionError: GPU and CPU float values are different [1968, 'HYPOT(a, b)']
[2023-11-16T08:36:34.335Z] = 1 failed, 19231 passed, 2174 skipped, 620 xfailed, 278 xpassed, 78 warnings in 5706.96s (1:35:06) =
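
To put the second mismatch in the log above in perspective, the following sketch measures how far apart the two printed values are, both as a relative difference and as a count of representable double-precision steps (ULPs). The `ulp_distance` helper and the `REL_TOL` threshold are illustrative assumptions; they are not the comparison that the test framework's approximate-float check performs.

```python
import math
import struct


def ulp_distance(x: float, y: float) -> int:
    """How many representable double steps apart two finite,
    same-sign doubles are (compares their raw bit patterns)."""
    xi = struct.unpack("<q", struct.pack("<d", x))[0]
    yi = struct.unpack("<q", struct.pack("<d", y))[0]
    return abs(xi - yi)


# Values copied from the diff in the log above.
cpu = 3.422031462089623e-123
gpu = 3.422031462089622e-123

rel_diff = abs(cpu - gpu) / max(abs(cpu), abs(gpu))
ulps = ulp_distance(cpu, gpu)

# Hypothetical tolerance, chosen only for illustration.
REL_TOL = 1e-12
close_enough = math.isclose(cpu, gpu, rel_tol=REL_TOL)

print(f"relative difference: {rel_diff:.3e}")
print(f"ULP distance: {ulps}")
print(f"within rel_tol={REL_TOL}: {close_enough}")
```

Under these assumptions the two values are distinct doubles only a few ULPs apart, which is why a failure under an approximate comparison is surprising and worth investigating.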

@winningsix
Collaborator Author

build

@revans2
Collaborator

revans2 commented Nov 16, 2023

For the HYPOT function, is it acceptable to have a small difference from the CPU result? The GPU value differs from the CPU value in the last printed digit, e.g. by 0.000 000 000 000 001e-123 for one row. @NVnavkumar

I think this is okay, but we should file an issue for it. It is likely caused by DATAGEN_SEED=1700118086, but we need to understand why we are off by more than APPROX_FLOAT can handle.

I'll file the issue.

@revans2
Collaborator

revans2 commented Nov 16, 2023

I filed #9744 for this and I'll put up a PR to mark it as xfail shortly.

@winningsix winningsix merged commit 2c69f8c into NVIDIA:branch-23.12 Nov 16, 2023
37 checks passed
@sameerz sameerz added the task Work required that improves the product but is not user facing label Nov 17, 2023
Development

Successfully merging this pull request may close these issues.

[FEA] rewrite join conditions where only part of it can fit on the AST