Minor: support complex expr as the arg in the ApproxPercentileCont function #8580

Merged
liukun4515 merged 2 commits into apache:main on Dec 20, 2023

Conversation

liukun4515

Which issue does this PR close?

Closes #.

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added physical-expr Physical Expressions core Core DataFusion crate labels Dec 19, 2023
@liukun4515 liukun4515 changed the title support complex expr as the arg in the ApproxPercentileCont function Minor: support complex expr as the arg in the ApproxPercentileCont function Dec 19, 2023

@alamb alamb left a comment

Thanks @liukun4515 -- this PR seems to fix the problem and has a test so 👍

This doesn't seem like it is a problem that is specific to approx_percentile_cont -- there might be a way to make it more general but we could also always do that as a follow on PR

@@ -186,6 +187,25 @@ async fn test_fn_approx_percentile_cont() -> Result<()> {

assert_batches_eq!(expected, &batches);

// the arg2 parameter is a complex expr, but it can be evaluated to the literal value
I think this may be related to aliasing rather than simply complex expressions

❯ create table t (x int) as values (1), (2);
0 rows in set. Query took 0.001 seconds.

❯ select APPROX_PERCENTILE_CONT(x,  cast(1 as double)) from t;
+--------------------------------------+
| APPROX_PERCENTILE_CONT(t.x,Int64(1)) |
+--------------------------------------+
| 2                                    |
+--------------------------------------+
1 row in set. Query took 0.002 seconds.

However, I couldn't express the same alias using SQL.

I wonder if the expression simplifier code could be generalized to handle this case, so it applied to all arguments, not just the arguments to APPROX_PERCENTILE_CONT 🤔

@liukun4515 liukun4515 Dec 20, 2023

Yes, the simplify_expression rule cannot handle the alias expr.
But cast(1 as float) will be simplified.
I think this is the difference between the cast and the alias.
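
To make the cast-versus-alias distinction concrete, here is a minimal sketch of a folding step that looks through both wrappers before deciding whether the percentile argument is a constant. Everything in it is hypothetical: the toy `Expr` enum and the `fold_to_f64` and `percentile_arg` helpers are illustration only, and are not DataFusion's `Expr` type, its simplifier, or the code merged in this PR.

```rust
// Toy expression tree for illustration only; this is NOT DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Int(i64),
    Float(f64),
    // A cast to a floating-point type wrapping an inner expression.
    CastToFloat(Box<Expr>),
    // An alias wrapping an inner expression: (expression, alias name).
    Alias(Box<Expr>, String),
    // A column reference, which is not a constant.
    Column(String),
}

/// Try to fold an expression down to a constant f64.
/// Casts and aliases are both looked through; a column reference
/// cannot be folded, so the result is `None`.
fn fold_to_f64(expr: &Expr) -> Option<f64> {
    match expr {
        Expr::Int(v) => Some(*v as f64),
        Expr::Float(v) => Some(*v),
        Expr::CastToFloat(inner) => fold_to_f64(inner),
        // The alias name does not change the value, so ignore it.
        Expr::Alias(inner, _) => fold_to_f64(inner),
        Expr::Column(_) => None,
    }
}

/// Hypothetical validation of a percentile argument: it must fold to a
/// constant in the range [0, 1].
fn percentile_arg(expr: &Expr) -> Result<f64, String> {
    let p = fold_to_f64(expr)
        .ok_or_else(|| format!("percentile must be a constant, got {:?}", expr))?;
    if (0.0..=1.0).contains(&p) {
        Ok(p)
    } else {
        Err(format!("percentile must be between 0 and 1, got {}", p))
    }
}
```

With this toy model, an aliased cast such as `Alias(CastToFloat(Int(1)), "arg_2")` folds to a constant, while a bare column reference is rejected. A general fix in the simplifier would apply the same idea to every function argument rather than to one aggregate.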

@liukun4515

> Thanks @liukun4515 -- this PR seems to fix the problem and has a test so 👍
>
> This doesn't seem like it is a problem that is specific to approx_percentile_cont -- there might be a way to make it more general but we could also always do that as a follow on PR

Yes, this PR is a quick fix for the issue.

We can enhance the simplify_expression rule to handle this.

@liukun4515 liukun4515 merged commit 1bcaac4 into apache:main Dec 20, 2023
22 checks passed