Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug] Fix entropy repeat mode bug #2697

Merged
merged 1 commit into from
Jun 10, 2021

Conversation

senhuang42
Copy link
Contributor

@senhuang42 senhuang42 commented Jun 7, 2021

Fixes a bug from OSS-Fuzz that occurs in the block splitter: the change to the blockState entropy repeat mode was erroneously happening before ZSTD_blockState_confirmRepcodesAndEntropyTables() and would get overwritten.

Testing - this was never caught by the OSS fuzzer when it was running on the advanced API. But it was caught within a few days of adding block splitter to the old API fuzzers.

@terrelln
Copy link
Contributor

terrelln commented Jun 7, 2021

Testing - this was never caught by the OSS fuzzer when it was running on the advanced API. But it was caught within a few days of adding block splitter to the old API fuzzers.

I haven't looked at the PR yet, but why is this the case? Is it because the fuzzer that caught it doesn't use the advanced API, and we need to extend its coverage?

@senhuang42
Copy link
Contributor Author

senhuang42 commented Jun 7, 2021

I haven't looked at the PR yet, but why is this the case? Is it because the fuzzer that caught it doesn't use the advanced API, and we need to extend its coverage?

The fuzzer that caught it is dictionary_round_trip, which uses compress_usingDict. The dictionary_stream_round_trip fuzzer does use block splitter, but with compressStream2. So maybe it's the case that this repeat mode issue only surfaces on single-shot dict compression? But then again I'm not entirely sure why that would be the case for this particular issue.

@terrelln
Copy link
Contributor

terrelln commented Jun 7, 2021

The fuzzer that caught it is dictionary_round_trip, which uses compress_usingDict

Could we extend it to use the advanced API 1/2 the time? E.g. the same way simple_round_trip does.

@senhuang42
Copy link
Contributor Author

senhuang42 commented Jun 8, 2021

Could we extend it to use the advanced API 1/2 the time? E.g. the same way simple_round_trip does.

I was wrong - dictionary_round_trip only uses the old API 1/16 times, compress2 otherwise.

if (FUZZ_dataProducer_uint32Range(producer, 0, 15) == 0) {

@terrelln
Copy link
Contributor

terrelln commented Jun 8, 2021

Do you think there is a reason why the fuzzer didn't catch it until we added the old API? Or do you think it is just random chance?

@senhuang42
Copy link
Contributor Author

Do you think there is a reason why the fuzzer didn't catch it until we added the old API? Or do you think it is just random chance?

I had investigated a bit, and it's still not really clear to me how this ends up happening, though I'll keep looking. Random chance being the reason seems somewhat unlikely since the old API is only hit with 1/16th of the frequency, and the old API only uses block splitter with high compression levels.

@senhuang42 senhuang42 merged commit d5f3568 into facebook:dev Jun 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants