You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
SortPreservingMergeStream currently uses a binary heap for k-way merging, which gets O(NlogS) time complexity (S=num_streams). however, when a top record is taken from the heap, we need to perform a pop/push operation and are likely to take 2*logS comparisons.
an improved way is to use a Tournament Tree (aka Loser Tree) to do the selection. when the top record is taken, the tree structure is not modified, and only the path from bottom to top is visited. the number of comparisons is always logS.
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
SortPreservingMergeStream
currently uses a binary heap for k-way merging, which getsO(NlogS)
time complexity (S=num_streams). however, when a top record is taken from the heap, we need to perform a pop/push operation and are likely to take2*logS
comparisons.an improved way is to use a Tournament Tree (aka Loser Tree) to do the selection. when the top record is taken, the tree structure is not modified, and only the path from bottom to top is visited. the number of comparisons is always
logS
.reference: https://en.wikipedia.org/wiki/K-way_merge_algorithm#Tournament_Tree
Describe the solution you'd like
implement the loser tree algorithm in
SortPreservingMergeStream
.Describe alternatives you've considered
Additional context
the benchmark shows the merging time is ~50% shorter after applying the tournament tree.
The text was updated successfully, but these errors were encountered: