perf(linter): use binary_search instead of contains #4446

Open
wants to merge 4 commits into
base: main

Conversation

togami2864
Contributor

@togami2864 togami2864 commented Nov 1, 2024

Summary

Replace all .contains calls on static lookup arrays with .binary_search.
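As a rough illustration of the kind of change described here, the sketch below compares the two lookups on a small sorted array. The constant `KNOWN_UNITS` and both function names are hypothetical and are not taken from the PR's diff.

```rust
// Hypothetical lookup table, not taken from the PR. It must stay sorted in
// ascending byte order, otherwise binary_search may miss entries.
const KNOWN_UNITS: &[&str] = &["cm", "em", "in", "mm", "pt", "px", "rem"];

/// Linear scan: correct on any slice, O(n) comparisons.
fn is_known_unit_linear(name: &str) -> bool {
    KNOWN_UNITS.contains(&name)
}

/// Binary search: O(log n) comparisons, only correct on a sorted slice.
fn is_known_unit_binary(name: &str) -> bool {
    KNOWN_UNITS.binary_search(&name).is_ok()
}
```

Both functions return the same answer as long as the array stays sorted; the binary variant silently breaks if an entry is added out of order.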

Test Plan

Added tests to ensure values are sorted.

@togami2864 togami2864 self-assigned this Nov 1, 2024
@github-actions github-actions bot added A-Linter Area: linter L-CSS Language: CSS labels Nov 1, 2024

codspeed-hq bot commented Nov 1, 2024

CodSpeed Performance Report

Merging #4446 will not alter performance

Comparing togami2864:perf/binsearch (9a55fc8) with main (f38694c)

Summary

✅ 99 untouched benchmarks

Contributor

@arendjr arendjr left a comment


Can we enforce this at the type level somehow? The obvious solution would be to use a BTreeSet instead. I’m a bit afraid someone would add an entry and overlook the fact that the entries need to be ordered, and we’d have a bug.
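A minimal sketch of what the BTreeSet option could look like, assuming a lazily initialized static. The names `KNOWN_UNITS_SET` and `is_known_unit` are hypothetical. The set orders its entries itself, so the order in which they are written no longer matters, at the cost of building the set at runtime on first use.

```rust
use std::collections::BTreeSet;
use std::sync::LazyLock;

// Hypothetical table: entries are deliberately written out of order to show
// that the BTreeSet takes care of ordering. LazyLock requires Rust 1.80+.
static KNOWN_UNITS_SET: LazyLock<BTreeSet<&'static str>> =
    LazyLock::new(|| BTreeSet::from(["px", "em", "rem", "cm", "mm", "in", "pt"]));

/// Lookup in O(log n) comparisons without relying on source order.
fn is_known_unit(name: &str) -> bool {
    KNOWN_UNITS_SET.contains(name)
}
```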

@Conaclos
Member

Conaclos commented Nov 2, 2024

Can we enforce this at the type level somehow? The obvious solution would be to use a BTreeSet instead. I’m a bit afraid someone would add an entry and overlook the fact that the entries need to be ordered, and we’d have a bug.

We could add unit tests as we did for the sorted arrays of JS builtins.

@arendjr
Contributor

arendjr commented Nov 2, 2024

Adding tests would indeed be another approach, but do we know the benefit of this change to begin with? If there are no benchmarks, we're just complicating trivial functionality for unclear gain.

It could even very well be that we're making things slower with this approach: https://www.reddit.com/r/rust/comments/1anlbui/comment/kpxl77q/

@Conaclos
Member

Conaclos commented Nov 2, 2024

It could even very well be that we're making things slower with this approach: https://www.reddit.com/r/rust/comments/1anlbui/comment/kpxl77q/

Yes, indeed. For small arrays, linear search is typically the fastest approach. It is unclear to me what "small" means. The link you shared suggests 100 items. However, this also depends on the cost of the comparison function, which is really cheap for integers and somewhat more expensive for strings.
Moreover, Rust 1.82 (released in October 2024) introduced a rewrite of the binary search implementation (see the associated changelog entry and PR). It should now be faster than in previous versions. Notably, when LLVM is able to determine the slice length (which is likely in the current use case), it generates compact branchless code. Thus, I am unsure whether using one or the other makes a real difference in our use case.
Another approach is to use a perfect hash function. If I remember correctly, it is also slower than a linear search for small arrays of strings.

I tend to use linear search for very small arrays (8 items or fewer). We could evaluate if the number should be higher. For quite-small arrays binary search should be good enough.
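The rule of thumb above could be sketched as a single helper. The cut-off of 8 is the figure from this comment, not a measured value, and the names `LINEAR_SEARCH_MAX_LEN` and `contains_sorted` are hypothetical.

```rust
/// Assumed cut-off taken from the comment above; not a benchmarked value.
const LINEAR_SEARCH_MAX_LEN: usize = 8;

/// Looks up `needle` in a slice that the caller guarantees is sorted.
/// Very small slices are scanned linearly; larger ones use binary search.
fn contains_sorted(sorted: &[&str], needle: &str) -> bool {
    if sorted.len() <= LINEAR_SEARCH_MAX_LEN {
        sorted.contains(&needle)
    } else {
        sorted.binary_search(&needle).is_ok()
    }
}
```

Whether such a threshold pays off would still need a benchmark on the actual arrays.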

By the way, if we write tests to check order we could use the recently stabilized is_sorted method.
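Such an order check might look like the sketch below, assuming Rust 1.82 or later for the slice method `is_sorted`. The constant `SORTED_KEYWORDS` and the test name are hypothetical, not taken from the PR.

```rust
// Hypothetical table that is looked up with binary_search elsewhere.
const SORTED_KEYWORDS: &[&str] = &["cm", "em", "in", "mm", "pt", "px", "rem"];

#[test]
fn sorted_keywords_are_sorted() {
    // Fails as soon as someone adds an entry out of order.
    assert!(SORTED_KEYWORDS.is_sorted());
}
```

Note that `is_sorted` accepts equal neighbours, so this check alone does not catch duplicate entries.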

@togami2864 togami2864 marked this pull request as ready for review November 14, 2024 11:43