
add try_optimize() for all rules. #4599

Merged · 1 commit into apache:master on Dec 14, 2022
Conversation

jackwener (Member)

Which issue does this PR close?

Closes #4598.
Follow-up to #4208.

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?
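The headline change of the PR is a fallible try_optimize entry point whose Ok(None) result means "no rewrite". A minimal sketch of that shape, with made-up stand-in types (this LogicalPlan, the String error type, and NoopRule are placeholders, not DataFusion's actual signatures):

```rust
// Simplified stand-in for a plan node; the real type is far richer.
#[derive(Clone, Debug, PartialEq)]
struct LogicalPlan(String);

trait OptimizerRule {
    // New entry point: Ok(None) means "this rule made no change".
    fn try_optimize(&self, plan: &LogicalPlan) -> Result<Option<LogicalPlan>, String>;

    // The old API can be expressed in terms of the new one:
    // errors propagate via `?`, and None falls back to the input plan.
    fn optimize(&self, plan: &LogicalPlan) -> Result<LogicalPlan, String> {
        Ok(self.try_optimize(plan)?.unwrap_or_else(|| plan.clone()))
    }
}

struct NoopRule;
impl OptimizerRule for NoopRule {
    fn try_optimize(&self, _plan: &LogicalPlan) -> Result<Option<LogicalPlan>, String> {
        Ok(None) // no rewrite, and no clone needed to say so
    }
}

fn main() {
    let plan = LogicalPlan("Filter -> TableScan".to_string());
    let out = NoopRule.optimize(&plan).unwrap();
    assert_eq!(out, plan); // an unchanged plan falls back to a clone of the input
    println!("ok");
}
```

Expressing the old optimize in terms of try_optimize (as one of the hunks below does) keeps error propagation intact while letting a rule signal "no change" cheaply.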

@alamb (Contributor) left a comment:

Very nice -- thank you @jackwener 🏅

I wonder if after this change we should perhaps simply remove OptimizerRule::optimize to simplify the code 🤔

@@ -107,7 +111,7 @@ impl OptimizerRule for DecorrelateWhereExists {
         }

         // iterate through all exists clauses in predicate, turning each into a join
-        let mut cur_input = (**filter_input).clone();
+        let mut cur_input = filter_input.clone();
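The hunk above replaces a double-dereferencing clone with a plain one. As a general illustration of what each spelling clones in Rust (simplified String/Arc types here, not the actual plan types in DecorrelateWhereExists):

```rust
use std::sync::Arc;

fn main() {
    let shared: Arc<String> = Arc::new(String::from("plan"));
    let r: &Arc<String> = &shared;

    // (**r).clone() dereferences through the reference and the Arc,
    // producing a deep clone of the inner value.
    let inner: String = (**r).clone();
    assert_eq!(inner, "plan");

    // r.clone() resolves to Arc's Clone impl: a cheap refcount bump,
    // not a copy of the inner value.
    let handle: Arc<String> = r.clone();
    assert_eq!(Arc::strong_count(&shared), 2);
    assert_eq!(*handle, "plan");
    println!("ok");
}
```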
A Contributor commented: ❤️

@@ -40,27 +40,45 @@ impl OptimizerRule for EliminateLimit {
     fn optimize(
         &self,
         plan: &LogicalPlan,
-        optimizer_config: &mut OptimizerConfig,
+        _optimizer_config: &mut OptimizerConfig,
A Contributor commented:

I find it strange to use a variable named _something -- I thought the Rust convention was to name variables starting with _ when they were not used 🤔
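For reference, the convention in question: a leading underscore marks a binding that must exist to satisfy a signature but is intentionally unused, which silences the unused-variable lint. A hypothetical example (not PR code):

```rust
// The underscore prefix tells rustc the parameter is intentionally unused;
// without it, this function would trigger an unused-variable warning.
fn apply_rule(plan: &str, _optimizer_config: &str) -> String {
    // _optimizer_config is accepted to satisfy the signature but never read.
    format!("optimized({plan})")
}

fn main() {
    assert_eq!(apply_rule("scan", "cfg"), "optimized(scan)");
    println!("ok");
}
```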

jackwener (Member Author) replied:

Resolved.
BTW, in the follow-up PR I will unify them so that all of them become optimizer_config where possible.

@alamb (Contributor) commented Dec 13, 2022:

cc @andygrove

@jackwener (Member Author) commented Dec 14, 2022:

I wonder if after this change we should perhaps simply remove OptimizerRule::optimize to simplify the code 🤔

I plan to do it in a follow-up PR.

@jackwener (Member Author):

Follow-up PR:

  • remove all optimize() implementations
  • avoid requiring every rule to recurse into its children inside the optimizer
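The second bullet can be sketched with a toy model, assuming the idea is that the optimizer driver owns the recursion so each rule rewrites only the node it is handed (the Plan enum and rule below are made up, not DataFusion code):

```rust
#[derive(Clone, Debug, PartialEq)]
enum Plan {
    Limit(usize, Box<Plan>),
    Scan(String),
}

// A rule rewrites at most the node it is given; it never recurses itself.
fn eliminate_zero_limit(p: &Plan) -> Option<Plan> {
    match p {
        Plan::Limit(0, _) => Some(Plan::Scan("empty".into())),
        _ => None,
    }
}

// The driver applies the rule top-down and performs the recursion once,
// centrally, instead of every rule re-implementing the tree walk.
fn apply(rule: &dyn Fn(&Plan) -> Option<Plan>, p: &Plan) -> Plan {
    let rewritten = rule(p).unwrap_or_else(|| p.clone());
    match rewritten {
        Plan::Limit(n, child) => Plan::Limit(n, Box::new(apply(rule, &child))),
        leaf => leaf,
    }
}

fn main() {
    let plan = Plan::Limit(
        10,
        Box::new(Plan::Limit(0, Box::new(Plan::Scan("t".into())))),
    );
    let out = apply(&eliminate_zero_limit, &plan);
    // The inner Limit(0, ...) is rewritten even though the rule never recursed.
    assert_eq!(out, Plan::Limit(10, Box::new(Plan::Scan("empty".into()))));
    println!("ok");
}
```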

-        optimize_internal(&DFSchema::empty(), plan)
+        Ok(self
+            .try_optimize(plan, optimizer_config)?
+            .unwrap_or_else(|| plan.clone()))
crepererum (Contributor) commented:

I've stumbled upon this while working on #4614. I think this is wrong: the type coercion MUST NOT be skipped in case of an error, otherwise we may end up with non-executable plans.

@jackwener (Member Author) replied Dec 14, 2022:

This PR doesn't change any processing logic, so I'm not sure what skipping you are referring to.
Currently try_optimize always returns Some; alternatively, I could replace .unwrap_or_else(|| plan.clone()) with unwrap().

And it will be deleted in the next PR, so I think it doesn't matter.

A Contributor replied:

I think @crepererum is saying that if the type coercion rules fail internally (return an Error) it is likely a serious bug in DataFusion and the plan will not work as expected.

By effectively ignoring the error here, the error will appear at some later stage (e.g. can't run the plan), making it harder to debug the source of the issue

🤔

@jackwener (Member Author) replied Dec 14, 2022:

I get it; there was a misunderstanding here.
I felt this comment wasn't related to this PR, so I was a little confused by it.

The error is not ignored here:

            self
            .try_optimize(plan, optimizer_config)?
            .unwrap_or_else(|| plan.clone())

Any error is returned by the `?` operator.

unwrap_or_else only handles the Option<> inside the Ok value.
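jackwener's point can be checked with a tiny standalone model (a made-up try_optimize, not the real trait): `?` returns any Err to the caller, and unwrap_or_else only supplies a fallback for the Ok(None) case.

```rust
// Toy stand-in: Ok(Some(_)) = rewritten plan, Ok(None) = no change, Err = failure.
fn try_optimize(fail: bool, changed: bool) -> Result<Option<i32>, String> {
    if fail {
        return Err("rule failed".to_string());
    }
    Ok(if changed { Some(42) } else { None })
}

fn optimize(fail: bool, changed: bool) -> Result<i32, String> {
    // Errors are propagated by `?`; only Ok(None) falls back to the original (0 here).
    Ok(try_optimize(fail, changed)?.unwrap_or_else(|| 0))
}

fn main() {
    assert_eq!(optimize(false, true), Ok(42)); // rule rewrote the plan
    assert_eq!(optimize(false, false), Ok(0)); // rule made no change
    assert!(optimize(true, true).is_err());    // the error is NOT swallowed
    println!("ok");
}
```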

A Contributor replied:

OK, let's move the discussion / fix to a follow-up: #4615.

@jackwener (Member Author) commented Dec 14, 2022:

#4618 and #4619 rely on this PR.
Especially #4618: I think it's a good improvement, and it's also something I wanted to do half a year ago 🚀.

@andygrove (Member) left a comment:

Thanks for working on this @jackwener. LGTM.

@Dandandan Dandandan merged commit 508ba80 into apache:master Dec 14, 2022
@Dandandan (Contributor):

Thank you @jackwener 😎

@ursabot commented Dec 14, 2022:

Benchmark runs are scheduled for baseline = 0baf5ef and contender = 508ba80. 508ba80 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs: benchmarking of arrow-datafusion-commits is not supported on any of the four runners (ec2-t3-xlarge-us-east-2, test-mac-arm, ursa-i9-9960x, ursa-thinkcentre-m75q), so all runs were skipped ⚠️.

@jackwener jackwener deleted the try_optimize branch December 25, 2022 14:32
Labels: optimizer (Optimizer rules)
Projects: None yet
Linked issue: Add try_optimize for all_rules
6 participants