Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce stack usage of block splitter #2780

Merged
merged 1 commit into from
Sep 23, 2021

Conversation

senhuang42
Copy link
Contributor

@senhuang42 senhuang42 commented Sep 15, 2021

This PR moves some of the block splitter stuff that was using too much stack space

I see no speed delta for levels 16-19 (have block splitter enabled by default) on enwik7 with gcc.

contrib/linux-kernel test
old: Maximum stack size: 2760
new: Maximum stack size: 1848

Todo: Fix fuzzer errors

/* block splitter ctx */
if (ZSTD_CParams_useBlockSplitter(&params->cParams)) {
/* Silence -Wcast-align with cast to void* */
zc->blockSplitCtx = (ZSTD_blockSplitCtx*)(void*)ZSTD_cwksp_reserve_buffer(ws, sizeof(ZSTD_blockSplitCtx));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-Wcast-align has a point here. You'll need correct alignment here or certain platforms will fail.

This probably needs to go in the ZSTD_cwksp_reserve_object() space, or if that doesn't work just put it directly into the CCtx.

Copy link
Contributor Author

@senhuang42 senhuang42 Sep 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah you're right. I think this will just rest directly in the CCtx. If I want to save space when we're not using block splitting, a wksp object doesn't seem to work properly with the different cctx reuse cases, since objects are assumed to remain the same throughout compressions.

If I wanted to make it work somehow, it would have to be the same as the other objects in that they stay the same throughout different cctx re-uses. And if we have to do that anyways, we may as well just keep it in the cctx. Though, this does mean we will just have to eat the >1K additional heap cost.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though, this does mean we will just have to eat the >1K additional heap cost.

That should be fine, thats much less than the total CCtx allocation. E.g. we have 500KB space for sequences, which is way overkill in most cases. So there is lower hanging fruit.

@senhuang42
Copy link
Contributor Author

Rebased on top of #2788. For review, just consider the last commit in the stack.

Copy link
Contributor

@terrelln terrelln left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

Before merging, you should rebase on top of dev, so you don't get any extra commits in the diff.

@senhuang42 senhuang42 merged commit c7afbec into facebook:dev Sep 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants