SortPreservingMerge Special Case Single Column #5882

Closed
tustvold opened this issue Apr 5, 2023 · 1 comment

@tustvold
Contributor

tustvold commented Apr 5, 2023

Is your feature request related to a problem or challenge?

SortPreservingMerge currently always uses the arrow row format. This provides compelling benefits when sorting by a tuple of multiple columns, but it is often the case that a sort is performed on a single column. lexsort_to_indices, which is used by SortExec, detects this case and calls through to sort_to_indices, which has specialised implementations for each column type. This significantly outperforms converting to the row format because it allows fixed-width comparisons, which are significantly faster.
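
To make the contrast concrete, here is a minimal std-only sketch of the two approaches for a single `i32` column. The function names (`encode_row`, `sort_indices_via_rows`, `sort_indices_native`) are hypothetical stand-ins for illustration and are not the arrow-rs APIs; the byte encoding shown is only an assumption about what an order-preserving encoding for a signed integer looks like.

```rust
// Hypothetical, std-only stand-ins; these are NOT the arrow-rs functions.

/// Order-preserving byte encoding for an i32: flip the sign bit and store
/// big-endian, so unsigned byte-wise comparison matches numeric order.
fn encode_row(v: i32) -> [u8; 4] {
    ((v as u32) ^ 0x8000_0000).to_be_bytes()
}

/// Sort by first converting every value to its byte encoding,
/// then comparing the encoded bytes.
fn sort_indices_via_rows(values: &[i32]) -> Vec<usize> {
    let rows: Vec<[u8; 4]> = values.iter().map(|v| encode_row(*v)).collect();
    let mut indices: Vec<usize> = (0..values.len()).collect();
    indices.sort_by(|a, b| rows[*a].cmp(&rows[*b]));
    indices
}

/// Sort by comparing the native fixed-width values directly,
/// with no intermediate encoding step.
fn sort_indices_native(values: &[i32]) -> Vec<usize> {
    let mut indices: Vec<usize> = (0..values.len()).collect();
    indices.sort_by_key(|i| values[*i]);
    indices
}
```

Both functions produce the same ordering; the native path simply skips the conversion and compares the primitive values themselves.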

Describe the solution you'd like

I would like to be able to use a specialized, fixed-width sort cursor within SortPreservingMerge.
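
A rough sketch of what this could look like, assuming the merge is made generic over a cursor abstraction. The `Cursor` trait, its methods, and the `merge` function below are hypothetical and written against std only; they are not the DataFusion types. The winner is chosen with a simple linear scan over the inputs, which is not necessarily how the real operator selects it.

```rust
use std::cmp::Ordering;

/// Hypothetical cursor abstraction: a position within one sorted input.
trait Cursor {
    fn is_finished(&self) -> bool;
    fn advance(&mut self);
    /// Compare the current rows of two unfinished cursors.
    fn compare(&self, other: &Self) -> Ordering;
}

/// Cursor over a single sorted primitive column; comparisons are
/// plain fixed-width integer comparisons.
struct PrimitiveCursor {
    values: Vec<i64>,
    offset: usize,
}

impl PrimitiveCursor {
    fn new(values: Vec<i64>) -> Self {
        Self { values, offset: 0 }
    }

    fn current(&self) -> i64 {
        self.values[self.offset]
    }
}

impl Cursor for PrimitiveCursor {
    fn is_finished(&self) -> bool {
        self.offset >= self.values.len()
    }

    fn advance(&mut self) {
        self.offset += 1;
    }

    fn compare(&self, other: &Self) -> Ordering {
        self.current().cmp(&other.current())
    }
}

/// Merge sorted inputs, returning the input index each output row is
/// taken from. Ties go to the lower input index.
fn merge<C: Cursor>(cursors: &mut [C]) -> Vec<usize> {
    let mut order = Vec::new();
    loop {
        // Linear scan for the unfinished cursor with the smallest current row.
        let mut winner: Option<usize> = None;
        for i in 0..cursors.len() {
            if cursors[i].is_finished() {
                continue;
            }
            winner = match winner {
                Some(w) if cursors[w].compare(&cursors[i]) != Ordering::Greater => Some(w),
                _ => Some(i),
            };
        }
        match winner {
            Some(w) => {
                order.push(w);
                cursors[w].advance();
            }
            None => return order,
        }
    }
}
```

Because `merge` only depends on the trait, a row-format cursor for multi-column sorts and a primitive cursor for single-column sorts could share the same merge loop.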

Describe alternatives you've considered

No response

Additional context

#5854 contained a POC implementation of this

@tustvold tustvold added the enhancement New feature or request label Apr 5, 2023
@tustvold tustvold self-assigned this Apr 5, 2023
tustvold added a commit to tustvold/arrow-datafusion that referenced this issue Apr 5, 2023
tustvold added a commit to tustvold/arrow-datafusion that referenced this issue Apr 5, 2023
@alamb
Contributor

alamb commented Apr 5, 2023

cc @jaylmiller

tustvold added a commit to tustvold/arrow-datafusion that referenced this issue Apr 6, 2023
tustvold added a commit that referenced this issue Apr 7, 2023
* Generify SortPreservingMerge (#5882) (#5879)

* Review feedback
tustvold added a commit to tustvold/arrow-datafusion that referenced this issue Apr 7, 2023
tustvold added a commit that referenced this issue Apr 11, 2023
…ive column faster (#5897)

* Specialize PrimitiveCursor (#5882)

* Toml format

* Review feedback
korowa pushed a commit to korowa/arrow-datafusion that referenced this issue Apr 13, 2023
…ive column faster (apache#5897)

* Specialize PrimitiveCursor (apache#5882)

* Toml format

* Review feedback