
fix performance issue in scenario #2966 (part 1) #2969

Merged (2 commits into dev, Jan 5, 2022)
Conversation

@Cyan4973 (Contributor) commented Jan 4, 2022

Reusing a compression state across multiple successive compression jobs
should minimize the amount of allocation and initialization required.

This mostly matters in situations where initialization is an overwhelming task compared to compression itself.
This can happen when the amount to compress is small,
while the compression state was expecting it to be much larger,
i.e., streaming mode without the benefit of a srcSize hint.
In that case, zstd can't rely on its automatic resource resizing capability, and the initialization cost can only be made acceptable by amortizing it over a series of compressions.
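To make the amortization idea concrete, here is a minimal, hypothetical C sketch (not zstd's actual data structures): creating the context pays the full cost of zeroing a large table, while resetting it for the next job only re-zeroes the part a previous job may have dirtied. The `ctx_t` type and its functions are invented for illustration only.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical context: creation zeroes a large table sized for a big
 * input; reset only clears what the previous job actually dirtied. */
#define TABLE_BYTES ((size_t)1 << 20)

typedef struct {
    unsigned char *table;
    size_t init_work;   /* total bytes zeroed so far (for measurement) */
} ctx_t;

ctx_t *ctx_create(void) {
    ctx_t *ctx = malloc(sizeof(*ctx));
    ctx->table = calloc(1, TABLE_BYTES);   /* full, expensive init */
    ctx->init_work = TABLE_BYTES;
    return ctx;
}

/* Lean re-init for a new job: zero only the bytes the previous job
 * dirtied (capped at what the new job will actually use). */
void ctx_reset(ctx_t *ctx, size_t dirty, size_t needed) {
    size_t to_clear = dirty < needed ? dirty : needed;
    memset(ctx->table, 0, to_clear);
    ctx->init_work += to_clear;
}

void ctx_free(ctx_t *ctx) { free(ctx->table); free(ctx); }
```

With this shape, a series of small compressions pays the large zeroing cost once at creation, and each subsequent reset costs only a few KB of work, which is the effect the PR restores.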

This diff restores a lean-initialization optimization that was present up to v1.4.9 and was lost in commit 980f3bb .

The following measurements were taken on a Core i7-9700K (turbo disabled) with fullbench -b41, using geldings.txt (a small text file) as the sample. The test corresponds to a scenario using ZSTD_compressStream() without the benefit of knowing the small sample size beforehand.

level   v1.5.1     this PR
1       101 MB/s   113 MB/s
2        67 MB/s   112 MB/s
3        31 MB/s    94 MB/s
4        14 MB/s    93 MB/s

Note how important this lean-initialization optimization becomes for small data as the compression level increases, since the amount of resources to initialize grows with the level.

Note that this PR does not completely fix #2966,
since another heavy initialization, specific to rowHash mode,
is also present (and wasn't in v1.4.9).
This second issue will be fixed in a separate commit.

For reference:

level   v1.5.1     this PR    comment
4       14 MB/s    93 MB/s    dfast
5        9 MB/s    20 MB/s    rowHash
...     ...        ...        rowHash
12      0.4 MB/s   1.2 MB/s   rowHash
13      0.6 MB/s   38 MB/s    btlazy2

When reusing a compression state across multiple successive compressions,
the state should minimize the amount of allocation and initialization required.

This mostly matters in situations where initialization is an overwhelming task
compared to compression itself.
This can happen when the amount to compress is small,
while the compression state was given the impression that it would be much larger,
i.e., streaming mode without providing a srcSize hint.

This lean-initialization optimization was broken in 980f3bb .

This commit fixes it, making this scenario once again on par with v1.4.9.

Note that this does not completely fix #2966,
since another heavy initialization, specific to row mode,
is also happening (and was not present in v1.4.9).
This will be fixed in a separate commit.
@felixhandte (Contributor) left a comment:


Ok, I believe I understand the functional change.

(It would be helpful in the future to separate functional and aesthetic changes into their own commits.)

ws->tableValidEnd = end;
ws->objectEnd = objectEnd;
ws->tableEnd = objectEnd;
if (ws->tableEnd > ws->tableValidEnd) ws->tableValidEnd = objectEnd;

I believe this is the single line of change that actually changes the behavior of the code and that everything else is just formatting/naming/comments. (Correct me if I'm wrong!)

I guess this makes sense.

Nit: the phrasing is a bit awkward. Normally I would expect an expression of this form to be if (a > b) b = a;, but here you have if (a > b) b = c; where it happens to be the case that a == c. I think this would be clearer as:

if (objectEnd > ws->tableValidEnd) {
  ws->tableValidEnd = objectEnd;
}
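The discussion above can be made concrete with a small self-contained C sketch. The `ws_t` struct and `ws_move_boundaries` function are illustrative stand-ins (not zstd's actual workspace definition); the field names mirror the snippet under review, and the clamp uses the reviewer's clearer `if (a > b) b = a;` form to preserve the invariant that `tableValidEnd` never falls below `tableEnd`.

```c
#include <assert.h>

/* Illustrative workspace watermarks (not zstd's real definition):
 * memory up to tableValidEnd is known to be zeroed, so tables ending
 * before it can skip re-initialization. */
typedef struct {
    char *objectEnd;
    char *tableEnd;
    char *tableValidEnd;
} ws_t;

/* Move the object/table boundaries to objectEnd, clamping
 * tableValidEnd upward so tableEnd <= tableValidEnd always holds. */
void ws_move_boundaries(ws_t *ws, char *objectEnd) {
    ws->objectEnd = objectEnd;
    ws->tableEnd  = objectEnd;
    if (objectEnd > ws->tableValidEnd) {
        ws->tableValidEnd = objectEnd;   /* the clearer a > b => b = a form */
    }
    assert(ws->tableEnd <= ws->tableValidEnd);
}
```

When the new boundary lies below the valid watermark, the watermark is untouched (the zeroed region stays usable); only when the boundary moves past it does the watermark get dragged along.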

@felixhandte added the Performance (Speed regressions) label Jan 4, 2022
@Cyan4973 Cyan4973 merged commit 3c2c3fb into dev Jan 5, 2022
@Cyan4973 Cyan4973 deleted the fix2966_part1 branch January 13, 2023 04:28
Labels: CLA Signed, Performance (Speed regressions)

May close: MUCH slower compression speeds using version 1.5.1 (#2966)