[lazy] Speed up compilation times #2828

Merged: 1 commit, Oct 25, 2021

Commits on Oct 22, 2021

  1. [lazy] Speed up compilation times

    Speed up compilation times by moving each specialized search function
    into its own function. This is faster because compilers can handle many
    smaller functions much faster than one gigantic function. The previous
    approach generated one giant function with `switch` statements and
    inlining to select the implementation.
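
As a rough illustration of the restructuring described above, the sketch below generates one small out-of-line function per specialization from a shared inline template and selects among them through a table of function pointers, instead of inlining every variant into one large `switch`. All names (`search_generic`, `GEN_SEARCH_FN`, `select_search`, `run_search`) and the toy search logic are hypothetical and are not taken from the zstd sources; the actual PR may organize its dispatch differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: none of these names come from zstd itself. */

typedef size_t (*search_fn)(const unsigned char* src, size_t len);

/* Shared template. It is inlined into each wrapper below, so the
 * compiler can constant-fold minMatch separately in every wrapper. */
static inline size_t search_generic(const unsigned char* src, size_t len,
                                    unsigned minMatch)
{
    /* Toy "search": count positions that start a run of minMatch equal bytes. */
    size_t count = 0;
    for (size_t i = 0; i + minMatch <= len; i++) {
        unsigned k = 1;
        while (k < minMatch && src[i + k] == src[i]) k++;
        if (k == minMatch) count++;
    }
    return count;
}

/* One small, separate function per specialization. */
#define GEN_SEARCH_FN(mls)                                              \
    static size_t search_mls##mls(const unsigned char* src, size_t len) \
    {                                                                   \
        return search_generic(src, len, mls);                           \
    }

GEN_SEARCH_FN(4)
GEN_SEARCH_FN(5)
GEN_SEARCH_FN(6)

/* Pick the specialization at run time via a lookup table. */
static search_fn select_search(unsigned minMatch)
{
    static const search_fn table[] = { search_mls4, search_mls5, search_mls6 };
    assert(minMatch >= 4 && minMatch <= 6);
    return table[minMatch - 4];
}

/* The caller stays small: it contains one indirect call, not every variant. */
size_t run_search(const unsigned char* src, size_t len, unsigned minMatch)
{
    return select_search(minMatch)(src, len);
}
```

With this layout the compiler optimizes each `search_mls*` function on its own, and the caller no longer grows with the number of specializations.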
    
    | Compiler | Flags                               | Dev Time (s) | PR Time (s) | Delta |
    |----------|-------------------------------------|--------------|-------------|-------|
    | gcc      | -O3                                 |         16.5 |         5.6 |  -66% |
    | gcc      | -O3 -g -fsanitize=address,undefined |        158.9 |        38.2 |  -76% |
    | clang    | -O3                                 |         36.5 |         5.5 |  -85% |
    | clang    | -O3 -g -fsanitize=address,undefined |         27.8 |        17.5 |  -37% |
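
The Delta column appears to be the relative change from the Dev measurement to the PR measurement. A minimal sketch of that calculation follows; the helper name `delta_percent` is an assumption made for illustration.

```c
#include <assert.h>

/* Relative change from the baseline (dev) to the new (pr) value, in percent.
 * Negative results mean the PR value is smaller than the baseline. */
double delta_percent(double dev, double pr)
{
    assert(dev > 0.0); /* a baseline measurement must be positive */
    return (pr - dev) / dev * 100.0;
}
```

For the gcc `-O3` row, `delta_percent(16.5, 5.6)` is about -66.06, which matches the -66% shown.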
    
    This also reduces the binary size because the search functions are no
    longer inlined into the main body.
    
    | Compiler | Dev libzstd.a Size (B) | PR libzstd.a Size (B) | Delta |
    |----------|------------------------|-----------------------|-------|
    | gcc      |                1563868 |               1308844 |  -16% |
    | clang    |                1924372 |               1376020 |  -28% |
    
    Finally, performance is not significantly impacted by this change; in
    fact, we generally see a small speed boost.
    
    | Compiler | Level | Dev Speed (MB/s) | PR Speed (MB/s) | Delta |
    |----------|-------|------------------|-----------------|-------|
    | gcc      |     5 |            110.6 |           110.0 | -0.5% |
    | gcc      |     7 |             70.4 |            72.2 | +2.5% |
    | gcc      |     9 |             53.2 |            53.5 | +0.5% |
    | gcc      |    13 |             12.7 |            12.9 | +1.5% |
    | clang    |     5 |            113.9 |           110.4 | -3.0% |
    | clang    |     7 |             67.7 |            70.6 | +4.2% |
    | clang    |     9 |             51.9 |            52.2 | +0.5% |
    | clang    |    13 |             12.4 |            13.3 | +7.2% |
    
    The compression strategy is unmodified in this PR, so the compressed size
    should be exactly the same. I may have a follow-up PR to slightly improve
    the compression ratio, if it doesn't cost too much speed.
    terrelln committed Oct 22, 2021
    13cad3a