Improve non-GCC implementation of statistics functions Quantile/Rank #77
Labels
good first issue
Good issue for first-time contributors
lang: c++
Issues and PRs related to the C++ codebase
type: feature
Issues and PRs related to new features
Is your feature request related to a problem? Please describe.
Because the policy-based data structures (PBDS) library is GCC-specific, I had to change the non-GCC implementation of Quantile and Rank in cpp/cppnodes/statsimpl.cpp so that it does not use an order statistics tree. Instead, it stores the sorted values in a std::multiset and iterates over that set in linear time to find each quantile/rank when triggered. This is much less efficient than the O(log n) lookup in the PBDS tree.
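To illustrate where the linear cost comes from, here is a minimal sketch of index and rank lookups on a std::multiset. The helper names kthSmallestLinear and rankLinear are hypothetical and are not taken from statsimpl.cpp; the actual code there may be structured differently.

```cpp
#include <cstddef>
#include <iterator>
#include <set>

// Hypothetical helper (not from csp): k-th smallest value, 0-based.
// Requires k < values.size(). std::multiset iterators are bidirectional,
// so std::advance walks one node at a time: O(k) steps.
inline double kthSmallestLinear(const std::multiset<double>& values, std::size_t k) {
    auto it = values.begin();
    std::advance(it, static_cast<std::ptrdiff_t>(k));
    return *it;
}

// Hypothetical helper (not from csp): number of stored values strictly less than x.
// lower_bound itself is O(log n), but std::distance on bidirectional
// iterators counts the nodes in between one by one, so the total is linear.
inline std::size_t rankLinear(const std::multiset<double>& values, double x) {
    return static_cast<std::size_t>(std::distance(values.begin(), values.lower_bound(x)));
}
```

The set keeps the values ordered, but it does not expose subtree sizes, so there is no way to jump directly to the k-th element.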
Describe the solution you'd like
It would be helpful if someone implemented a balanced (red-black) order statistics tree (OST) natively in csp. The OST is a binary search tree with the additional invariant that each node stores the size of its own subtree (https://en.wikipedia.org/wiki/Order_statistic_tree). This allows for efficient, O(log n), insert, find, delete, and index-based lookup.
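As a rough sketch of the size-augmentation idea, the following uses a treap (randomized priorities) rather than a red-black tree, purely to keep the example short. That means the O(log n) bounds here are expected, not worst-case; a red-black version would maintain the same size field through its rotations. The class name OrderStatisticTree and its methods are hypothetical, not an existing csp API, and the code assumes C++17.

```cpp
#include <cstddef>
#include <memory>
#include <random>
#include <stdexcept>
#include <utility>

// Hypothetical sketch (not csp code): treap-based order statistics tree.
// Duplicates are allowed, mirroring the std::multiset behaviour.
class OrderStatisticTree {
    struct Node {
        double key;
        unsigned priority;
        std::size_t size = 1;  // number of nodes in this subtree
        std::unique_ptr<Node> left, right;
        Node(double k, unsigned p) : key(k), priority(p) {}
    };
    using Ptr = std::unique_ptr<Node>;

    Ptr root_;
    std::mt19937 rng_{12345};  // fixed seed, chosen arbitrarily for the sketch

    static std::size_t sz(const Ptr& n) { return n ? n->size : 0; }
    static void update(Node& n) { n.size = 1 + sz(n.left) + sz(n.right); }

    // Split t into (keys < key, keys >= key).
    static std::pair<Ptr, Ptr> split(Ptr t, double key) {
        if (!t) return {};
        if (t->key < key) {
            auto [a, b] = split(std::move(t->right), key);
            t->right = std::move(a);
            update(*t);
            return {std::move(t), std::move(b)};
        }
        auto [a, b] = split(std::move(t->left), key);
        t->left = std::move(b);
        update(*t);
        return {std::move(a), std::move(t)};
    }

    // Merge a and b, assuming every key in a is <= every key in b.
    static Ptr merge(Ptr a, Ptr b) {
        if (!a) return b;
        if (!b) return a;
        if (a->priority > b->priority) {
            a->right = merge(std::move(a->right), std::move(b));
            update(*a);
            return a;
        }
        b->left = merge(std::move(a), std::move(b->left));
        update(*b);
        return b;
    }

    // Remove one node equal to key from the subtree rooted at t.
    static bool eraseOne(Ptr& t, double key) {
        if (!t) return false;
        if (key < t->key || t->key < key) {
            bool removed = eraseOne(key < t->key ? t->left : t->right, key);
            if (removed) update(*t);
            return removed;
        }
        t = merge(std::move(t->left), std::move(t->right));
        return true;
    }

public:
    std::size_t size() const { return sz(root_); }

    void insert(double key) {
        auto [lo, hi] = split(std::move(root_), key);
        root_ = merge(merge(std::move(lo), std::make_unique<Node>(key, rng_())), std::move(hi));
    }

    // Removes a single occurrence of key; returns false if it is absent.
    bool erase(double key) { return eraseOne(root_, key); }

    // 0-based k-th smallest value, found by descending on subtree sizes.
    double kth(std::size_t k) const {
        const Node* n = root_.get();
        while (n) {
            std::size_t leftSize = sz(n->left);
            if (k < leftSize) n = n->left.get();
            else if (k == leftSize) return n->key;
            else { k -= leftSize + 1; n = n->right.get(); }
        }
        throw std::out_of_range("k >= size()");
    }

    // Number of stored values strictly less than key.
    std::size_t rank(double key) const {
        std::size_t r = 0;
        for (const Node* n = root_.get(); n;) {
            if (n->key < key) { r += sz(n->left) + 1; n = n->right.get(); }
            else n = n->left.get();
        }
        return r;
    }
};
```

With something like this, a quantile lookup reduces to computing an index from the quantile and calling kth, and Rank reduces to a single rank call, each visiting one root-to-leaf path instead of scanning the whole set.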
Describe alternatives you've considered
There are actually 2 data structures that will achieve O(log n) insert, delete and index-based lookup
Additional context
This would be helpful for any non-GCC users and would also make the code more readable, since we would no longer be using two different approaches for the same stats calculation.