Introduce tournament tree to achieve better k-way sort-merging #4300

Closed
richox opened this issue Nov 21, 2022 · 0 comments · Fixed by #4301
Labels
enhancement New feature or request

richox commented Nov 21, 2022

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

SortPreservingMergeStream currently uses a binary heap for k-way merging, which gives O(N log S) time complexity (S = num_streams). However, each time the top record is taken from the heap, we need a pop followed by a push, which is likely to cost about 2*log S comparisons.

A better approach is to use a tournament tree (also known as a loser tree) for the selection. When the top record is taken, the tree structure is not modified; only the path from the winning stream's leaf up to the root is revisited, so selecting the next record takes about log S comparisons.

Reference: https://en.wikipedia.org/wiki/K-way_merge_algorithm#Tournament_Tree
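
To make the idea concrete, here is a minimal sketch of a loser-tree merge over plain sorted `Vec<i32>` inputs. It is not the SortPreservingMergeStream code: the names `is_gt`, `replay`, and `merge_sorted` are hypothetical, and the array layout (node 0 holds the winner, and stream `i` enters at node `(k + i) / 2`) is one common choice that I am assuming here.

```rust
/// Returns true if the head of stream `a` sorts after the head of stream `b`.
/// An exhausted stream sorts after everything; ties are broken by stream index.
fn is_gt(inputs: &[Vec<i32>], cursors: &[usize], a: usize, b: usize) -> bool {
    match (inputs[a].get(cursors[a]), inputs[b].get(cursors[b])) {
        (None, _) => true,
        (_, None) => false,
        (Some(x), Some(y)) => (x, a) > (y, b),
    }
}

/// Walks from the leaf of `winner` towards the root. At each node the current
/// winner plays the stored loser; the loser stays, the winner moves up.
/// During the initial build an empty node (usize::MAX) just parks the winner.
fn replay(tree: &mut [usize], inputs: &[Vec<i32>], cursors: &[usize], mut winner: usize) {
    let k = inputs.len();
    let mut node = (k + winner) / 2;
    while node != 0 && tree[node] != usize::MAX {
        let challenger = tree[node];
        if is_gt(inputs, cursors, winner, challenger) {
            tree[node] = winner;
            winner = challenger;
        }
        node /= 2;
    }
    tree[node] = winner;
}

/// Merges already-sorted inputs into one sorted vector (hypothetical helper).
fn merge_sorted(inputs: &[Vec<i32>]) -> Vec<i32> {
    let k = inputs.len();
    let mut out = Vec::with_capacity(inputs.iter().map(Vec::len).sum());
    if k == 0 {
        return out;
    }
    let mut cursors = vec![0usize; k];
    // tree[0] is the overall winner; tree[1..k] hold the losers of each match.
    let mut tree = vec![usize::MAX; k];
    for i in 0..k {
        replay(&mut tree, inputs, &cursors, i);
    }
    loop {
        let winner = tree[0];
        match inputs[winner].get(cursors[winner]) {
            // The winner is exhausted, so every stream is exhausted.
            None => break,
            Some(&value) => {
                out.push(value);
                cursors[winner] += 1;
                // Only the path from this stream's leaf to the root is replayed.
                replay(&mut tree, inputs, &cursors, winner);
            }
        }
    }
    out
}
```

After a record is emitted, `replay` performs one comparison per level on the way up and never restructures the tree, which is where the saving over a heap pop/push comes from. Breaking ties by stream index keeps equal keys in stream order in this sketch.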

Describe the solution you'd like
Implement the loser tree algorithm in SortPreservingMergeStream.

Describe alternatives you've considered

Additional context
The benchmark shows that merging time is ~50% shorter after applying the tournament tree.
