seekable decompression fixes #2594

Merged: senhuang42 merged 4 commits into facebook:dev from seekable_decompression-fixes on May 5, 2021

@azat azat commented Apr 30, 2021

Changelog:

  • seekable_format: cap the offset+len up to the last dOffset
    This allows reading the whole file without getting a corruption error if
    offset+len is more than the data left in the file, e.g.:

    $ ./seekable_compression seekable_compression.c 8192 | head
    $ zstd -cdq seekable_compression.c.zst | wc -c
    4737
    

    Before this patch:

    $ ./seekable_decompression seekable_compression.c.zst 0 10000000 | wc -c
    ZSTD_seekable_decompress() error : Corrupted block detected
    0
    

    After:

    $ ./seekable_decompression seekable_compression.c.zst 0 10000000 | wc -c
    4737
    
  • seekable_decompression: break when ZSTD_seekable_decompress() returns zero

  • seekable_decompression_mem: break when ZSTD_seekable_decompress() returns zero

  • seekable_format: fix from-file reading (not in-memory)
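
The first changelog item amounts to clamping the requested range against the decompressed size. A minimal sketch of that idea, assuming a hypothetical helper `cap_read_len` (not part of the zstd API, and not the code from this patch) and treating `total_size` as the last dOffset of the seek table:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the zstd API: cap a requested read so
 * that offset + len never goes past total_size (assumed here to be the
 * last dOffset in the seek table, i.e. the decompressed size). */
size_t cap_read_len(unsigned long long offset, size_t len,
                    unsigned long long total_size)
{
    if (offset >= total_size) return 0;   /* nothing left to read */
    {
        unsigned long long const remaining = total_size - offset;
        size_t const capped = (len > remaining) ? (size_t)remaining : len;
        assert(capped <= len);            /* capping never grows the request */
        return capped;
    }
}
```

With the numbers from the example above, `cap_read_len(0, 10000000, 4737)` returns 4737, so the oversized request is reduced to the data that actually exists.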

@@ -99,6 +99,9 @@ static void decompressFile_orDie(const char* fname, off_t startOffset, off_t end

    while (startOffset < endOffset) {
        size_t const result = ZSTD_seekable_decompress(seekable, buffOut, MIN(endOffset - startOffset, buffOutSize), startOffset);
        if (!result) {
Contributor


Nice optimization.
Is there more to it ? (does it dodge an error case ?)

Contributor Author


> Is there more to it ?

Added the same for seekable_decompression_mem.c.

> (does it dodge an error case ?)

In theory, before the frame overrun fix and without checksums, it was possible to end up in an endless loop.
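
To illustrate the endless-loop concern: if the decompress call returns zero, `startOffset` never advances and the `while` condition stays true. A minimal sketch of the loop shape, assuming a hypothetical in-memory stand-in (`mock_source`, `mock_decompress`) in place of the real seekable object and `ZSTD_seekable_decompress()`; error checking, which the real example also needs, is left out:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical in-memory source standing in for the seekable object. */
typedef struct {
    const char* data;
    size_t size;
} mock_source;

/* Stand-in for ZSTD_seekable_decompress(): copies up to len bytes starting
 * at offset and returns the number of bytes produced, or 0 once offset is
 * at or past the end of the data. */
size_t mock_decompress(const mock_source* src, void* dst, size_t len, size_t offset)
{
    if (offset >= src->size) return 0;
    {
        size_t const n = MIN(len, src->size - offset);
        memcpy(dst, src->data + offset, n);
        return n;
    }
}

/* Reads [start, end) in small chunks and returns the total bytes read. */
size_t read_range(const mock_source* src, size_t start, size_t end)
{
    char buf[16];
    size_t total = 0;
    assert(src != NULL);
    while (start < end) {
        size_t const result = mock_decompress(src, buf, MIN(end - start, sizeof buf), start);
        if (!result) break;   /* no progress: stop instead of spinning forever */
        total += result;
        start += result;
    }
    return total;
}
```

Without the `break`, a request whose end lies past the available data would keep calling the reader with the same offset forever.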

It tries to check the buffer boundary, but there is no buffer for
from-file reading.
@azat azat force-pushed the seekable_decompression-fixes branch from 475c49a to 32d0813 Compare April 30, 2021 18:46
@senhuang42 senhuang42 merged commit 53a60e9 into facebook:dev May 5, 2021