
fix performance issue in scenario #2966 (part 1) #2969

Merged (2 commits into dev, Jan 5, 2022)
Conversation

@Cyan4973 (Contributor) commented Jan 4, 2022

Reusing a compression state across multiple successive compression jobs
should minimize the amount of allocation and initialization required.

This mostly matters in situations where initialization is an overwhelming task compared to compression itself.
This can happen when the amount to compress is small,
while the compression state was expecting it to be much larger,
i.e., streaming mode without the benefit of a srcSize hint.
In that case, zstd can't rely on its automatic resource resizing capability, and the initialization cost can only be made acceptable by amortizing it over a series of compressions.
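To make the amortization idea concrete, here is a minimal, hypothetical C sketch (not zstd's actual data structures): creating the context pays the full cost of zeroing a large table, while resetting it for the next job only re-zeroes the part a previous job may have dirtied. The `ctx_t` type and its functions are invented for illustration only.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical context: creation zeroes a large table sized for a big
 * input; reset only clears what the previous job actually dirtied. */
#define TABLE_BYTES ((size_t)1 << 20)

typedef struct {
    unsigned char *table;
    size_t init_work;   /* total bytes zeroed so far (for measurement) */
} ctx_t;

ctx_t *ctx_create(void) {
    ctx_t *ctx = malloc(sizeof(*ctx));
    ctx->table = calloc(1, TABLE_BYTES);   /* full, expensive init */
    ctx->init_work = TABLE_BYTES;
    return ctx;
}

/* Lean re-init for a new job: zero only the bytes the previous job
 * dirtied (capped at what the new job will actually use). */
void ctx_reset(ctx_t *ctx, size_t dirty, size_t needed) {
    size_t to_clear = dirty < needed ? dirty : needed;
    memset(ctx->table, 0, to_clear);
    ctx->init_work += to_clear;
}

void ctx_free(ctx_t *ctx) { free(ctx->table); free(ctx); }
```

With this shape, a series of small compressions pays the large zeroing cost once at creation, and each subsequent reset costs only a few KB of work, which is the effect the PR restores.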

This diff restores a lean-initialization optimization that was present up to v1.4.9 and was lost in commit 980f3bb .

The following measurements were taken on a Core i7-9700K (turbo disabled) with fullbench -b41, using geldings.txt (a small text file) as the sample. The test corresponds to a scenario using ZSTD_compressStream() without the benefit of knowing the small sample size beforehand.

level   v1.5.1     this PR
1       101 MB/s   113 MB/s
2        67 MB/s   112 MB/s
3        31 MB/s    94 MB/s
4        14 MB/s    93 MB/s

Note how important this lean-initialization optimization becomes for small data as the compression level increases, since the amount of resources to initialize grows with the level.

Note that this PR does not completely fix #2966,
since another heavy initialization, specific to rowHash mode,
is also present (and wasn't in v1.4.9).
This second issue will be fixed in a separate commit.

For reference:

level   v1.5.1     this PR    comment
4       14 MB/s    93 MB/s    dfast
5        9 MB/s    20 MB/s    rowHash
...     ...        ...        rowHash
12      0.4 MB/s   1.2 MB/s   rowHash
13      0.6 MB/s   38 MB/s    btlazy2

When reusing a compression state across multiple successive compressions,
the state should minimize the amount of allocation and initialization required.

This mostly matters in situations where initialization is an overwhelming task
compared to compression itself.
This can happen when the amount to compress is small,
while the compression state was given the impression that it would be much larger,
i.e., streaming mode without providing a srcSize hint.

This lean-initialization optimization was broken in 980f3bb .

This commit fixes it, making this scenario once again on par with v1.4.9.

Note that this does not completely fix #2966,
since another heavy initialization, specific to row mode,
is also happening (and was not present in v1.4.9).
This will be fixed in a separate commit.
@felixhandte (Contributor) left a comment:


Ok, I believe I understand the functional change.

(It would be helpful in the future to separate functional and aesthetic changes into their own commits.)

ws->tableValidEnd = end;
ws->objectEnd = objectEnd;
ws->tableEnd = objectEnd;
if (ws->tableEnd > ws->tableValidEnd) ws->tableValidEnd = objectEnd;

I believe this is the single line of change that actually changes the behavior of the code and that everything else is just formatting/naming/comments. (Correct me if I'm wrong!)

I guess this makes sense.

Nit: the phrasing is a bit awkward. Normally I would expect an expression of this form to be if (a > b) b = a;, but here you have if (a > b) b = c; where it happens to be the case that a == c. I think this would be clearer as:

if (objectEnd > ws->tableValidEnd) {
  ws->tableValidEnd = objectEnd;
}
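The discussion above can be made concrete with a small self-contained C sketch. The `ws_t` struct and `ws_move_boundaries` function are illustrative stand-ins (not zstd's actual workspace definition); the field names mirror the snippet under review, and the clamp uses the reviewer's clearer `if (a > b) b = a;` form to preserve the invariant that `tableValidEnd` never falls below `tableEnd`.

```c
#include <assert.h>

/* Illustrative workspace watermarks (not zstd's real definition):
 * memory up to tableValidEnd is known to be zeroed, so tables ending
 * before it can skip re-initialization. */
typedef struct {
    char *objectEnd;
    char *tableEnd;
    char *tableValidEnd;
} ws_t;

/* Move the object/table boundaries to objectEnd, clamping
 * tableValidEnd upward so tableEnd <= tableValidEnd always holds. */
void ws_move_boundaries(ws_t *ws, char *objectEnd) {
    ws->objectEnd = objectEnd;
    ws->tableEnd  = objectEnd;
    if (objectEnd > ws->tableValidEnd) {
        ws->tableValidEnd = objectEnd;   /* the clearer a > b => b = a form */
    }
    assert(ws->tableEnd <= ws->tableValidEnd);
}
```

When the new boundary lies below the valid watermark, the watermark is untouched (the zeroed region stays usable); only when the boundary moves past it does the watermark get dragged along.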

@felixhandte added the Performance (Speed regressions) label Jan 4, 2022
@Cyan4973 Cyan4973 merged commit 3c2c3fb into dev Jan 5, 2022
@Cyan4973 Cyan4973 deleted the fix2966_part1 branch January 13, 2023 04:28
Labels: CLA Signed, Performance (Speed regressions)

May close: MUCH slower compression speeds using version 1.5.1 (#2966)