Improve zstd_opt build speed and size #2898

Merged: 1 commit, Dec 3, 2021

terrelln (Contributor) commented Dec 2, 2021

Use the same trick as we did for zstd_lazy in PR #2828:

  • Create one search function specialization for each (dictMode, mls).
  • Select the search function pointer at the top of the match finder.
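
The two steps above can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: every name here (`dict_mode_t`, `search_fn`, `GEN_SEARCH`, `select_search_fn`, `find_matches`) is hypothetical, the mls range of 3 to 6 is an assumption, and the search body is a stub that returns a tag instead of finding matches.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the dictionary modes; not zstd's real enum. */
typedef enum { DM_NO_DICT = 0, DM_EXT_DICT = 1, DM_DICT_MATCH_STATE = 2 } dict_mode_t;
#define DM_COUNT 3u
#define MLS_MIN 3u   /* assumed smallest minimum match length */
#define MLS_MAX 6u   /* assumed largest minimum match length */

/* Signature shared by every search specialization. */
typedef unsigned (*search_fn)(const unsigned char* ip, const unsigned char* iend);

/* Generic search body. mode and mls are constants at each call site below,
 * so the compiler can fold branches on them.
 * Stub: returns a tag encoding (mode, mls) instead of searching. */
static inline unsigned search_generic(const unsigned char* ip,
                                      const unsigned char* iend,
                                      dict_mode_t mode, unsigned mls)
{
    (void)ip; (void)iend;
    return (unsigned)mode * 10u + mls;
}

/* Step 1: one small specialization per (dictMode, mls) pair. */
#define GEN_SEARCH(suffix, mode, mls)                                   \
    static unsigned search_##suffix##_##mls(const unsigned char* ip,    \
                                            const unsigned char* iend)  \
    { return search_generic(ip, iend, mode, mls); }

#define GEN_SEARCH_ALL_MLS(suffix, mode)                        \
    GEN_SEARCH(suffix, mode, 3) GEN_SEARCH(suffix, mode, 4)     \
    GEN_SEARCH(suffix, mode, 5) GEN_SEARCH(suffix, mode, 6)

GEN_SEARCH_ALL_MLS(noDict, DM_NO_DICT)
GEN_SEARCH_ALL_MLS(extDict, DM_EXT_DICT)
GEN_SEARCH_ALL_MLS(dms, DM_DICT_MATCH_STATE)

#define SEARCH_ROW(suffix)                                      \
    { search_##suffix##_3, search_##suffix##_4,                 \
      search_##suffix##_5, search_##suffix##_6 }

/* Step 2: pick the specialization from a table at run time. */
search_fn select_search_fn(dict_mode_t mode, unsigned mls)
{
    static const search_fn table[DM_COUNT][MLS_MAX - MLS_MIN + 1] = {
        SEARCH_ROW(noDict), SEARCH_ROW(extDict), SEARCH_ROW(dms)
    };
    assert((unsigned)mode < DM_COUNT);
    if (mls < MLS_MIN) mls = MLS_MIN;   /* clamp into the supported range */
    if (mls > MLS_MAX) mls = MLS_MAX;
    return table[mode][mls - MLS_MIN];
}

/* The match finder selects the pointer once, then reuses it per position. */
unsigned find_matches(const unsigned char* src, size_t len,
                      dict_mode_t mode, unsigned mls)
{
    search_fn const search = select_search_fn(mode, mls);
    unsigned total = 0;
    size_t i;
    for (i = 0; i < len; i++) total += search(src + i, src + len);
    return total;
}
```

In this arrangement only the small search functions are duplicated per (dictMode, mls) pair, while the surrounding match finder is compiled once and pays one indirect call per search.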

Additionally, we no longer inline ZSTD_compressBlock_opt_generic into
every function, since dictMode is no longer a template parameter.
Instead, we create two specializations, one for opt level 0 and one for
opt level 2, and each block compressor calls one of the two.
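
A rough sketch of that shape, under stated assumptions: the function names, the entry points, and the stubbed return value below are all hypothetical and only illustrate the structure, not the actual zstd_opt code. The opt level is a constant in each specialization, while the dictionary mode is an ordinary runtime argument.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the dictionary modes; not zstd's real enum. */
typedef enum { DM_NO_DICT = 0, DM_EXT_DICT = 1, DM_DICT_MATCH_STATE = 2 } dict_mode_t;

/* Generic parser body. optLevel is a constant at both call sites below, so
 * branches on it can be folded; dictMode stays a plain runtime value.
 * Stub: the return value is arbitrary and only tags which path ran. */
static inline size_t compress_block_opt_generic(const void* src, size_t srcSize,
                                                int optLevel, dict_mode_t dictMode)
{
    (void)src;
    assert(optLevel == 0 || optLevel == 2);
    return srcSize * 100u + (size_t)optLevel * 10u + (size_t)dictMode;
}

/* The only two instantiations of the generic body. */
static size_t compress_block_opt0(const void* src, size_t srcSize, dict_mode_t dictMode)
{
    return compress_block_opt_generic(src, srcSize, 0, dictMode);
}

static size_t compress_block_opt2(const void* src, size_t srcSize, dict_mode_t dictMode)
{
    return compress_block_opt_generic(src, srcSize, 2, dictMode);
}

/* Entry points become thin wrappers that forward to one of the two. */
size_t compress_btopt(const void* src, size_t srcSize)
{
    return compress_block_opt0(src, srcSize, DM_NO_DICT);
}

size_t compress_btultra(const void* src, size_t srcSize)
{
    return compress_block_opt2(src, srcSize, DM_NO_DICT);
}

size_t compress_btopt_extDict(const void* src, size_t srcSize)
{
    return compress_block_opt0(src, srcSize, DM_EXT_DICT);
}

size_t compress_btultra_dictMatchState(const void* src, size_t srcSize)
{
    return compress_block_opt2(src, srcSize, DM_DICT_MATCH_STATE);
}
```

With this layout the large generic body is compiled twice, however many entry points exist, which is where most of the compile-time and size savings would be expected to come from.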

Lastly, remove the hack that disabled inlining for zstd_opt for the
Linux Kernel, as we've gotten most of the benefit already.

Compilation time sees a 4x to 6x reduction:

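| Compiler | Flags                            | Dev Time (s) | PR Time (s) | Delta |
|----------|----------------------------------|--------------|-------------|-------|
| gcc      | -O3                              |         10.1 |         2.3 |  -77% |
| gcc      | -O3 -fsanitize=address,undefined |         61.1 |        10.2 |  -83% |
| clang    | -O3                              |          9.0 |         2.1 |  -76% |
| clang    | -O3 -fsanitize=address,undefined |         33.5 |         5.1 |  -84% |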

Build size of libzstd.a is reduced by roughly 150 KB to 200 KB:

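| Compiler | Dev libzstd.a Size (B) | PR libzstd.a Size (B) | Delta |
|----------|------------------------|-----------------------|-------|
| gcc      |                1327476 |               1177108 |  -11% |
| clang    |                1378324 |               1167780 |  -15% |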
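
The Delta column appears to be the relative change from the dev build to the PR build. As a small sketch of that arithmetic (the helper names `percent_delta`, `bytes_saved`, and `print_size_row` are hypothetical, and the rounding to whole percent is assumed):

```c
#include <assert.h>
#include <stdio.h>

/* Relative change from dev to pr, in percent (negative means smaller). */
double percent_delta(double dev, double pr)
{
    assert(dev != 0.0);
    return (pr - dev) / dev * 100.0;
}

/* Absolute saving in bytes. */
long bytes_saved(long dev, long pr)
{
    return dev - pr;
}

/* Prints one row: compiler, bytes saved, and relative change. */
void print_size_row(const char* compiler, long dev, long pr)
{
    printf("%-5s saved %ld B (%.2f%%)\n", compiler,
           bytes_saved(dev, pr), percent_delta((double)dev, (double)pr));
}
```

For example, `print_size_row("gcc", 1327476L, 1177108L)` and `print_size_row("clang", 1378324L, 1167780L)` reproduce the two rows of the size table from their byte counts.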

Speed loss is about 2% or less in all cases (clang at level 16 is slightly faster):

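| Compiler | Level | Dev Speed (MB/s) | PR Speed (MB/s) | Delta  |
|----------|-------|------------------|-----------------|--------|
| gcc      |    16 |             4.78 |            4.72 | -1.25% |
| gcc      |    17 |             3.49 |            3.46 | -0.85% |
| gcc      |    18 |             2.92 |            2.86 | -2.04% |
| gcc      |    19 |             2.61 |            2.61 |  0.00% |
| clang    |    16 |             4.69 |            4.80 |  2.34% |
| clang    |    17 |             3.53 |            3.49 | -1.13% |
| clang    |    18 |             2.86 |            2.85 | -0.34% |
| clang    |    19 |             2.61 |            2.61 |  0.00% |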

Fixes Issue #2862.

Cyan4973 (Contributor) commented Dec 2, 2021

Great work! Nice build time and binary size savings!

terrelln merged commit 014bbb2 into facebook:dev on Dec 3, 2021