-
Notifications
You must be signed in to change notification settings - Fork 231
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support batching for RANGE
running window aggregations. Including on [databricks]
#9544
Support batching for RANGE
running window aggregations. Including on [databricks]
#9544
Conversation
This commit adds support for `FIRST()` window functions, running in a ROWS context. This does not currently support ignore/include nulls. Signed-off-by: MithunR <[email protected]>
Plus, tests for partitioned/unpartitioned, and keep/ignore nulls.
Also removed unused import.
Signed-off-by: MithunR <[email protected]>
GpuAlias covers the isRangeFrame check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good.
Signed-off-by: MithunR <[email protected]>
Build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general it looks really good. Just want to see if there is something we can do with the testing for range windows. Probably is fine as is, but I would love it if we could get some coverage.
This reverts commit 22aaf7c. Best not combine the running window ROW/RANGE tests.
build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
Signed-off-by: MithunR <[email protected]>
Thanks for the review, @revans2. I'm adding a couple of similar new tests for the partitioned cases. |
Signed-off-by: MithunR <[email protected]>
RANGE
running window aggregationsRANGE
running window aggregations
Build |
RANGE
running window aggregationsRANGE
running window aggregations. Including on [databricks]
Build |
This is a followup to #9489, but applies to more than
FIRST()
.GpuRunningWindowExec
handles "running-window" aggregations, i.e.[UNBOUNDED PRECEDING, CURRENT ROW
, as long as they areROW
based window specifications.But when the window spec isn't specified, the default is to use
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
:This causes the window aggregation to be done with a single batch, with an increased likelihood of OOMs.
This commit adds batching support for
RANGE
queries by including the order-by column in batching. This allows the rows with the same values of partition-keys and order-by-keys to be processed as part of the same batch. This allows us to avoid crossing batches for fix-ups.