Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test Refactoring HashMaps with HashBrown #655

Closed
neeldug opened this issue Aug 22, 2020 · 11 comments
Closed

Test Refactoring HashMaps with HashBrown #655

neeldug opened this issue Aug 22, 2020 · 11 comments

Comments

@neeldug
Copy link
Contributor

neeldug commented Aug 22, 2020

Currently, Boa uses rustc-hash for all HashMaps, from looking at the documentation, there are no specific benchmarks on this hashing algorithm, and so I propose we test hashbrown, which is Rust's port for Google's SwissTable hash map, I figure that if we find any drastic performance benefits, it may be worth considering switching, as hashbrown can be used as a drop-in replacement for Rust's STL HashMap Collection. I believe testing this on a branch and viewing the benchmarks may show whether the HashMap implementation is causing a bottleneck.

EDIT: Seems similar to #356 which actually raised using rustc-hash for the very same reason, perhaps useful to see which performs better.

@Razican
Copy link
Member

Razican commented Aug 22, 2020

Hi, as far as I understand, Rust's standard library uses hashbrown under the hood since 2018 or so, am I correct?

@jasonwilliams
Copy link
Member

Yep, the Rust standard hashmap is already the hashbrown implementation https://blog.rust-lang.org/2019/07/04/Rust-1.36.0.html

@neeldug
Copy link
Contributor Author

neeldug commented Aug 22, 2020

Ahh ok, my bad, so was the performance enhancement for rustc-hash measurable? Also the hasher could be changed to use ahash or some other performant hasher.

@jasonwilliams
Copy link
Member

@neeldug you can see the last issue where we changed the hashmap implementation here #356

@neeldug
Copy link
Contributor Author

neeldug commented Aug 22, 2020

Yeah @jasonwilliams although at the time, there was no benchmarking done to see any quantifiable performance improvement, just was suggesting testing and seeing if there are any significant benchmarking improvements. Also specifically using the ahash hasher rather than the default SIP.

@jasonwilliams
Copy link
Member

jasonwilliams commented Aug 22, 2020

@neeldug my recommendation would be to make a new issue for that and if you can do find or do benchmarking to make a case that would be useful.

there was no benchmarking done to see any quantifiable performance improvement

#368 (comment)
#368 (comment)

@neeldug
Copy link
Contributor Author

neeldug commented Aug 22, 2020

Yeah, I saw this, but I’m unsure of whether previously the hashmaps were using ahash or SIP, if they were using ahash, then sure the benchmarks for sure are correct.

I’ll work on setting up an independent benchmark between the two.

@neeldug
Copy link
Contributor Author

neeldug commented Aug 23, 2020

@jasonwilliams did some testing which showed that the STL using ahash vs rustc-hash made negligible performance differences for the most part except for scaling with usize usize hashmaps, this made major differences with ahash outperforming rustc-hash for inserting and deleting large quantities of usize pairs into the hashmap. I believe that unless this operation is common and at large-scale, it's not worth switching.

Testing done here, tried integrating CI benching, however, I've heard that it's not the most accurate testing method.

@Razican
Copy link
Member

Razican commented Aug 24, 2020

Just to give some extra information here, we are not using the default hashing algorithm, we are using FXHash, which is noticeably faster.

@neeldug
Copy link
Contributor Author

neeldug commented Aug 24, 2020

Actually, @Razican just wondering whether these benchmarks, performed a year ago, apply to Boa’s current state, as the benchmarks that I performed were quite naive in implementation due to lack of knowledge and reliability across machines.

Prior to the original HashMap switch, the std collection hasher was using SIP, so the original performance comparison can not be used to compare.

Also, in doing some more research both AHash and FxHash are extremely fast, however, based on use case one generally can outperform the other, so I suggest that we test a specific parallel branch to master to keep track of whichever Hasher proves to be fasted pendant on current Hashmap usage. If this is ok, then I can create a draft PR and see how the benches workout comparatively.

@Razican
Copy link
Member

Razican commented Aug 25, 2020

Feel free to make a PR. If it's more performant, we can replace our implementation :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants