Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert opt outlining change #2778

Merged
merged 1 commit into from
Sep 15, 2021
Merged

Conversation

senhuang42
Copy link
Contributor

#2763 regressed performance of opt by about ~7%.

There are 5 inlined functions in opt. I've re-tested the performance with and without inlining of each one separately.

The only function that can be outlined without any obvious regression on gcc is ZSTD_getMatchPrice(). But even then, it saves a negligible 4KB of binary size and no discernible compile speed difference, and the speed changes are not obviously better.

benched on level 16 silesia.tar, gcc-11
All inlined: 5.90 MB/s
ZSTD_compressBlock_opt_generic outlined: 5.82 MB/s
ZSTD_BtGetAllMatches outlined: 5.79 MB/s
ZSTD_insertBtAndGetAllMatches outlined: 5.78 MB/s
ZSTD_updateTree_internal outlined: 5.82 MB/s
ZSTD_getMatchPrice outlined: 5.86 MB/s (equal performance for level 17-19 though, and on enwik7).

Seeing this, it doesn't seem worth it to have outlined these functions. Cumulatively, the effect of outlining all of them saves a lot on binary size and compile speed, but the speed trade-off is too severe.

@senhuang42 senhuang42 merged commit 9d2a45a into facebook:dev Sep 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants