
Fix endianness issues with zstd #12022

Merged 1 commit into openzfs:master on Aug 30, 2021

Conversation

@rincebrain (Contributor) commented May 11, 2021

All right, heck it, let's start soliciting feedback.

Motivation and Context

After working around the trivial issue in #12008 (which is also in this PR; I can separate it out if need be), it was discovered that using compress=zstd on any (at least Linux/ppc64 and Linux/sparc64) big-endian platform results in incorrect storage of the header for the record, such that any similarly incorrect system can read/write the data fine, but "correct" systems (e.g. Linux/x86_64) will EIO on reading that data.

(Fascinatingly, FBSD/ppc64 encodes it differently than Linux/{ppc64,sparc64,x86_64}, such that Linux/{ppc64,sparc64} can read its data and vice versa, which is why I included samples of all three in the test case in #12030.) As I commented in #12030, apparently my prior Linux/sparc64 test case was generated by a WIP patchset which mangled bytes differently, since I can no longer reproduce it, so that's been removed.

Description

This PR essentially contains two changes:

  • a simple fix to provide an implementation for two builtins that aren't available on sparc64 when compiling a kernel module (as was pointed out to me in #11992, this is undefined behavior, because I'm defining a thing with two underscores, but the alternative seemed to be changing the "stock" zstd code; whichever one people think is the lesser of two evils, I'm happy to do)
  • a heavy-handed fix to the issue of having formerly used varying behavior in the zfs_zstd header, which allows all platforms to read the data written by all other platforms, and makes them all write in the arrangement x86_64 used before this patch

(It occurs to me that specifying the relevant struct definition here might help. The header is defined as:

typedef struct zfs_zstd_header {
        /* Compressed size of data */
        uint32_t c_len;

        /*
         * Version and compression level
         * We use a union to be able to big endian encode a single 32 bit
         * unsigned integer, but still access the individual bitmasked
         * components easily.
         */
        union {
                uint32_t raw_version_level;
                struct {
                        uint32_t version : 24;
                        uint8_t level;
                };
        };

        char data[];
} zfs_zstdhdr_t;

and a troublesome thing that's done with it is:

        hdr->raw_version_level = BE_32(hdr->raw_version_level);
)
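
To make the hazard concrete: here is a small standalone demo (not part of this PR) that fills the union the same way the header code does and dumps its in-memory bytes. Because the layout of a bitfield is implementation-defined, the output differs across compilers and endiannesses; on little-endian x86_64 with gcc you would expect a5 28 00 03.

#include <stdio.h>
#include <stdint.h>

/*
 * Standalone illustration only: the same union shape as zfs_zstdhdr_t.
 * How the compiler packs "version : 24" plus "level" into these four
 * bytes is implementation-defined, which is the root of the bug.
 */
union demo {
        uint32_t raw_version_level;
        struct {
                uint32_t version : 24;
                uint8_t level;
        };
};

int
main(void)
{
        union demo d = { 0 };
        unsigned char *p = (unsigned char *)&d;

        d.version = 10405;      /* ZSTD_VERSION_NUMBER for zstd 1.4.5 */
        d.level = 3;
        printf("%02x %02x %02x %02x\n", p[0], p[1], p[2], p[3]);
        return (0);
}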

We were basically relying on the compiler's in-memory encoding of the bitfield and serializing that out to disk, resulting in interoperability problems.

This PR removes the bitfield (but leaves the raw_version_level uint32_t, as among other things a convenient pointer to the right place...), and wraps all access to the formerly-bitfielded contents of that uint32_t with get/set functions that search to determine the correct encoding (in the get case) and use pointer math to lay things out the same way for everyone (in the set case).

To quote the get method's comment, for how we do that:

        /*
         * So we're searching 4 bytes to figure out where the version
         * and level bytes are. The level bytes can cover the whole range
         * of uint8_t, and so are not helpful. Fortunately, the version
         * field is going to have a leading 00 from now until version
         * 6.55.56 or higher with how it's represented, so we can dig
         * that out, and know that wherever we found it, the two bytes
         * "next" to it in the range are the other version bytes, in order,
         * and whichever byte remains is the level field.
         */
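
As a standalone sketch of that search (the helper name and plain-shift style here are mine, not the PR's; the real code goes through the BF32_GET/BSWAP_32 macros), using the four candidate layouts enumerated in the review discussion further down:

#include <stdint.h>

/*
 * Sketch only. Candidate layouts of the 32-bit value, where 00 is the
 * literal zero byte, L0 the level, and V0/V1 the low/high version bytes:
 *   shift 0: 0x L0 V0 V1 00
 *   shift 1: 0x V0 V1 00 L0
 *   shift 2: 0x L0 00 V1 V0
 *   shift 3: 0x 00 V1 V0 L0
 */
static uint32_t
sketch_get_version(uint32_t raw)
{
        int shift;

        /* Find the guaranteed 0x00 byte (holds while version < 6.55.36). */
        for (shift = 0; shift < 4; shift++) {
                if (((raw >> (8 * shift)) & 0xff) == 0)
                        break;
        }

        switch (shift) {
        case 0: /* 0xL0V0V100 */
                return ((((raw >> 8) & 0xff) << 8) | ((raw >> 16) & 0xff));
        case 1: /* 0xV0V100L0 */
                return ((((raw >> 16) & 0xff) << 8) | ((raw >> 24) & 0xff));
        case 2: /* 0xL000V1V0 */
                return (raw & 0xffff);
        default: /* 0x00V1V0L0 */
                return ((raw >> 8) & 0xffff);
        }
}

The level getter is symmetric: whichever byte is left over once the zero byte and the two version bytes are accounted for is the level.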

Question: Do we want to somehow increment the zstd module version, e.g. 1.4.5a? From my own experience, at least DKMS will sometimes report "hey, the module version is the same, I totally don't have to replace it", and that would lead to all sorts of problems.

Question: Do we want to restrict set to setting both values at once? Either by making the individual setters private, or by making each of them call the other's zfs_get_hdr...() getter and then some combination setter that sets both at once. The code currently only sets them both (on a new hdr, no less), but if one setter were ever used on a hdr with the incorrect ordering without the other, we'd have lossily mangled that hdr.
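
For the second question, a hypothetical combined setter might look like the following (the function name and plain-shift packing are mine; the real code uses BF32_SET on raw_version_level). It rebuilds the whole word from scratch in the pre-patch x86_64 host arrangement (version in the low 24 bits, level in the top byte), so a lone partial update can never half-mangle a header:

static void
sketch_set_version_level(zfs_zstdhdr_t *hdr, uint32_t version, uint8_t level)
{
        uint32_t raw = 0;

        raw |= (version & 0xffffff);    /* low 24 bits: version */
        raw |= ((uint32_t)level << 24); /* top byte: level */
        hdr->raw_version_level = raw;
}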

How Has This Been Tested?

  • Without the first of the two groups of changes in this PR, confirmed Linux/sparc64 does not compile (but e.g. Linux/ppc64 does).
  • Without the second of these changes, confirmed that a Linux/sparc64 or Linux/ppc64 host would EIO on reading files written by an x86_64 host, and files written by one of them could be read by the other but cause similar EIO on x86_64.
  • With both changes, confirmed both a ppc64 host and sparc64 host could read and write data interoperably with each other and both an unpatched and patched x86_64 host (and vice-versa).
  • Confirmed the changes build and pass the "sendsanity" runfile on a Debian bullseye x86_64 host.
  • Confirmed the changes build and "pass"* the "sendsanity" runfile on a Debian sid ppc64 host.
  • Confirmed the changes build and "pass"* the "sendsanity" runfile on a FreeBSD 13 ppc64 host.
  • Confirmed all the above hosts pass the test added by #12030 ("Added test for being able to read various variants of zstd")
  • Confirmed the changes build and can successfully import and read/write in the right order on a Debian sid sparc64 host (who knew, an UltraSPARC IIi is slow...I'll update when it finishes testing.)

("pass" because there are existing failures I should go report that happen on unmodified git master...)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)


@koachan commented May 11, 2021

> what do we want to do about the people who have been using this on big-endian happily, not noticing the incompatibility?

I happen to run some zstd-compressed datasets on a BE machine at the moment; what can I do to work around this? Is simply changing the compression settings and recompressing the data (like with send/recv) enough, or do I need to entirely recreate the affected datasets/pools?

Thanks a lot for the fixes, by the way.

@allanjude (Contributor):

@koachan the first thing will be to determine if the data written to your disk is actually wrong.

If so, then we will have to somehow indicate this (bump the zstd version?), have some backwards-compat code for reading blocks labeled 10405 (the ZSTD_VERSION_NUMBER encoding of zstd 1.4.5) to decode them properly in the cross-endian case, and make all new writes with the corrected code use a new version indicator.

We might be able to avoid this, if we are sure we are writing the data to the disk the same way in both endians, and then we just have to do the right thing to decode it when reading.

@rincebrain (Contributor Author):

I was going to wait until I finished testing it on all the relevant arches and pushed the updated branch to note this here, but since you noticed it...

I have an updated branch that allows reading from every endianness mangling I've encountered on x86_64 and passes ZTS without issues. (...and writes the data out in the right order.) Once I've tested it on sparc64+ppc64+FBSD/ppc64, I'll push it.

rincebrain added a commit to rincebrain/zfs that referenced this pull request May 12, 2021
As detailed in openzfs#12022 and openzfs#12008, it turns out the current zstd implementation
is quite nonportable, and results in various configurations of ondisk header
that only each platform can read.

So I've added a test which contains a dataset with a file written by
Linux/x86_64, and one written by current FBSD/ppc64.

(I'll add one for Linux/sparc64 once the unmodified git checkout finishes
building. Linux/ppc64 behaves identically to Linux/sparc64.)

Signed-off-by: Rich Ercolani <[email protected]>
rincebrain added a commit to rincebrain/zfs that referenced this pull request May 12, 2021
As detailed in openzfs#12022 and openzfs#12008, it turns out the current zstd
implementation is quite nonportable, and results in various
configurations of ondisk header that only each platform can read.

So I've added a test which contains a dataset with a file written by
Linux/x86_64, one for Linux/sparc64, and one written by FBSD/ppc64.

Signed-off-by: Rich Ercolani <[email protected]>
@rincebrain changed the title from "Fix(?) endianness issues with zstd" to "Fix endianness issues with zstd" on May 17, 2021
@rincebrain (Contributor Author) commented May 17, 2021

All right, a bunch of slow testing later (though not all of it, since sparc64 is amazingly slow...), here's the new branch.

If anyone has better alternatives to what I did in the get/set functions that also pass #12030 (and, ideally, use one consistent layout on every platform going forward), I'm all ears.

edit: Since someone asked me "why can't we just check whether the 00 byte is at the 'head' or the 'tail'?": as illustrated in the test case in #12030, one of {Linux,FBSD}/ppc64 encodes "zstd level 3, version 10405" as something like 0028a503 and the other encodes it as a5280003. (I do not recall if that's the exact ordering, just that one reversed the order of the bytes in the "version" field while keeping the "level" field in the same location, such that each can read the other's data but zdb will spit out surprising results if asked about it.) Nope, that was presumably generated using a WIP patchset that mangled things further by mistake; I can't reproduce it, and have dropped it from the test in #12030.

@rincebrain rincebrain marked this pull request as ready for review May 19, 2021 15:38
rincebrain added a commit to rincebrain/zfs that referenced this pull request May 19, 2021
As detailed in openzfs#12022 and openzfs#12008, it turns out the current zstd
implementation is quite nonportable, and results in various
configurations of ondisk header that only each platform can read.

So I've added a test which contains a dataset with a file written by
Linux/x86_64 and one written by FBSD/ppc64.

Signed-off-by: Rich Ercolani <[email protected]>
@tonynguien added the "Status: Design Review Needed" (architecture or design is under discussion) label on May 24, 2021
@rincebrain force-pushed the 12008-sparc64 branch 2 times, most recently from 2718dc7 to aa3d2a5 on May 28, 2021
@rincebrain (Contributor Author):
There, one big replacement of things with macros later, it seems to work as expected for read and write on Linux/x86_64, Linux/sparc64, and FreeBSD/ppc64, as well as all three producing data readable as expected on unmodified Linux/x86_64.

I also stuck an "a" on the zstd module version.

cmd/zdb/zdb.c (outdated), comment on lines 2252 to 2270:
 * So we're searching 4 bytes to figure out where the version
 * and level bytes are. The level bytes can cover the whole range
 * of uint8_t, and so are not helpful. Fortunately, the version
 * field is going to have a leading 00 from now until version
 * 6.55.56 or higher with how it's represented, so we can dig
 * that out, and know that wherever we found it, the two bytes
 * "next" to it in the range are the other version bytes, in order,
 * and whichever byte remains is the level field.
 */
uint32_t level = blob->raw_version_level;
uint8_t findme = 0xff;
int shift;
for (shift = 0; shift < 4; shift++) {
        findme = BF32_GET(level, 8*shift, 8);
        if (findme == 0)
                break;
}
switch (shift) {
case 0:
Member:

rather than duplicating this code, maybe make one function that gets both level and version?

@rincebrain (Contributor Author):

Sure, I did that and added helpers to get one of the two.

cmd/zdb/zdb.c (outdated):
case 0:
        version = BSWAP_32(version);
        version = BF32_GET(version, 8, 24);
        break;
Member:

From reverse-engineering this code, I think that overall this is decoding a 32-bit number that can look like one of:

    0x L0 V0 V1 00
    0x V0 V1 00 L0
    0x L0 00 V1 V0
    0x 00 V1 V0 L0

where 00 is a literal 0, L0 is the level, V0 is the low bits of the version, and V1 is the high bits of the version.

I would appreciate seeing this explanation in the comments, as well as an explanation of how the on-disk format could come to be this way.

@rincebrain (Contributor Author):

Ah, I'm terribly sorry, I thought your prior request was explicitly "an explanation for why we can't use standard macros for this", which I thought was alleviated by doing so.

I have, to be clear, only seen two of these four cases in the wild (2 for LE and 3 for BE), outside of bytes I mangled with my own code modifications; I just included all the cases (and tested them, with aforementioned manually-induced-mangling) for completeness.

I'll include an explanation for the aforementioned two, but it will amount to "we wrote a representation which is not guaranteed to be consistent across platforms to disk".

Member:

Yeah, I think we could be more specific, e.g. saying that it used a bitfield and the order of the elements of the bitfield is not specified by the C language, so it might be version-first or level-first. Although in practice you could check if any of our supported platforms actually did it in an other-than-expected order, which might simplify this code.

@rincebrain (Contributor Author):

I've checked Linux/{aarch64,mips64el,ppc64,riscv64,sparc64,x86_64,x86} and FreeBSD/{x86_64,ppc64}, and the only difference is that BE did one case and LE did another. So I could hardcode checks for the two cases instead of all 4, but it doesn't feel like it simplifies much, since it's still going to require a nontrivial explanation for the two different bit configurations.

@rincebrain (Contributor Author):

There, I replaced the existing comment with one with diagrams. Please let me know if it's not clear enough.

Comment on lines 376 to 378
static uint8_t
zfs_get_hdrlevel(const zfs_zstdhdr_t *blob)
{
Member:

is there a way to share this code with zdb's version? Perhaps by putting it in a header file as an inline function? The same could be done for the setting functions.

@rincebrain (Contributor Author):

Sure, done.

 *
 * 32      24      16      8       0
 * +-------+-------+-------+-------+
 * | level |   0   |    version    |
Member:

Could you indicate the high and low bytes of the version? I think they are swapped here vs. line 121; i.e., the (2-byte) "version" here is not the same as the (2-byte) "version" on line 121.

@rincebrain (Contributor Author):

You're correct, they are; I thought that the clarity of just saying "version" versus "v0 | v1"/"v1 | v0" was a reasonable tradeoff, but I'll go change it since you disagree.

 *
 * 32      24      16      8       0
 * +-------+-------+-------+-------+
 * |    version    |   0   | level |
Member:

I think that this logic depends on "version" having some bits set in both its low byte and its high byte. It looks like that's the case for the current ZSTD_VERSION_NUMBER, but could we VERIFY() that's the case before we write it out? Otherwise we might accidentally bump to version 1.05.12, which I think would have a zero low byte, and then this logic breaks.

@rincebrain (Contributor Author) commented Jun 2, 2021:

You're correct, it does, though I don't think 1.5.12 would do it? If I'm understanding ZSTD_VERSION_NUMBER right, 1.5.12 turns into 1*10000 + 5*100 + 12 = 10512 (0x2910). 1.4.96 would, though, or 1.7.52, or ...

So (unless I'm overlooking something) there's two cases that can happen without the version number rising pretty dramatically far - the low byte is 0x00, or the high byte is 0x00. One effectively can't happen without the version rolling back in time to < 0.02.56 (or forward past 6.55.35, at which point I'd hope we could either drop the compat code or put it behind a non-default tunable), and the other winds up with a layout like 00 v0 00 LV or LV 00 v0 00; in either of the latter two cases, I believe the code will do the right thing no matter which order it iterates (though it should be consistent in order across platforms with this version anyway).

Assuming the above is right, I'm happy to add something to VERIFY that the case it can't handle and can happen (e.g. version >= 6.55.36) isn't true before writing it, though I think I'd prefer (and will unless told otherwise) to make it VERIFY against, say, <= 62000 so it causes breakage well before it's about to break the read code.
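
For reference, the version arithmetic above as a compile-time sketch (the macro name is made up; zstd's real ZSTD_VERSION_NUMBER is computed the same way):

/* MAJOR*10000 + MINOR*100 + RELEASE; 6.55.36 == 65536 is the first
 * version whose encoding needs a third byte. */
#define SKETCH_VERSION_NUMBER(maj, min, rel) \
        ((maj) * 10000 + (min) * 100 + (rel))

_Static_assert(SKETCH_VERSION_NUMBER(1, 4, 5) == 10405, "zstd 1.4.5");
_Static_assert(SKETCH_VERSION_NUMBER(1, 5, 12) == 10512, "0x2910: both bytes nonzero");
_Static_assert(SKETCH_VERSION_NUMBER(1, 4, 96) == 10496, "0x2900: low byte is zero");
_Static_assert(SKETCH_VERSION_NUMBER(6, 55, 36) == 65536, "first version needing 3 bytes");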

* header we're dealing with.)
*
* So now we use the BF32_SET macros to get consistent behavior (the LE
* encoding, since x86 currently rules the world) across platforms, but
Member:

It looks like the code actually uses the big-endian encoding (e.g. hdr->raw_version_level = BE_32(hdr->raw_version_level);). Assuming we plan to maintain this "figure out the byteorder" functionality forever, maybe we should continue to write it in host byteorder, so that there's some chance of exercising this searching code even with pools written by the current software?

@rincebrain (Contributor Author):

For the sake of precision: I tested this code against the assumption that it writes out bytes to disk in the same order that x86_64 platforms currently do, by attempting to read the bytes written by my modified code from both {x86_64, sparc64} hosts and by unmodified {x86_64, sparc64} hosts on an unpatched copy of git master on x86_64. Both of the former and the obvious one of the latter work.

Would you prefer I refer to the encoding in question as BSWAP_32(LE), to make it clear that it's the swapped representation of LE's view of the world, not the representation that BE platforms natively have, or something else?

I'd prefer to write only one ordering going forward (though I can straightforwardly go add further test cases to the ZTS PR where the above-discussed property of 00 v0 00 LV and vice-versa are true), so that, if we reach the point where this assertion becomes false (e.g. version > 6.55.35), we can reasonably safely stop using this by default a little while before it becomes a problem.

@rincebrain (Contributor Author):

I have no strong preference, except, in the absence of any preference from others, to just leave it as two separate things rather than flattening them.

@behlendorf (Contributor):

Works for me. @ahrens if you could take another look at this we should be able to get it wrapped up.

@yhaenggi commented Aug 8, 2021

@ahrens can you take a look at this? thanks

@rincebrain (Contributor Author):

Gentle ping?

I've just rebased it against master again. It'd be nice for at least all 3 users of ZFS on sparc64 to see this land...

@koachan commented Aug 27, 2021

A question: would a test on virtual disk images/loopback devices be enough? I'd like to test this, but I currently do not have a spare disk that I can use for experimenting, so I'm limited to disk images for now.

Also, is testing that the pool can be imported (and not get corrupted) on various endiannesses enough, or do I need to run other test cases too?

@rincebrain (Contributor Author):

> A question: would a test on virtual disk images/loopback devices be enough? I'd like to test this, but I currently do not have a spare disk that I can use for experimenting, so I'm limited to disk images for now.
>
> Also, is testing that the pool can be imported (and not get corrupted) on various endiannesses enough, or do I need to run other test cases too?

If you want to experiment with it for some reason, you can test it on a real disk or a file or whatever permutation of storage backend you like. You could even use the test case in #12030 to run through trying both the BE and LE ondisk outcomes.

@behlendorf (Contributor):

@rincebrain thanks for the reminder. This change looks good to me, and I agree we should go ahead and get it merged. Can you just squash this and wrap it up with a nice commit message?

@behlendorf added the "Status: Accepted" (ready to integrate: reviewed, tested) label and removed the "Status: Design Review Needed" (architecture or design is under discussion) label on Aug 27, 2021
It turns out that layouts of union bitfields are a pain, and the
current code results in an inconsistent layout between BE and LE
systems, leading to zstd-active datasets on one erroring out on
the other.

Switch everyone over to the LE layout, and add compatibility code
to read both.

Fixes: openzfs#12008

Signed-off-by: Rich Ercolani <[email protected]>
@rincebrain (Contributor Author):

Squashed and wrote a brief commit message.

@behlendorf merged commit b1a1c64 into openzfs:master on Aug 30, 2021
@behlendorf (Contributor):

Thanks for the squash and comment. Merged!

tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Sep 15, 2021
It turns out that layouts of union bitfields are a pain, and the
current code results in an inconsistent layout between BE and LE
systems, leading to zstd-active datasets on one erroring out on
the other.

Switch everyone over to the LE layout, and add compatibility code
to read both.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Signed-off-by: Rich Ercolani <[email protected]>
Closes openzfs#12008
Closes openzfs#12022
jwk404 pushed a commit that referenced this pull request Sep 20, 2021
As detailed in #12022 and #12008, it turns out the current zstd
implementation is quite nonportable, and results in various
configurations of ondisk header that only each platform can read.

So I've added a test which contains a dataset with a file written by
Linux/x86_64 and one written by FBSD/ppc64.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: John Kennedy <[email protected]>
Signed-off-by: Rich Ercolani <[email protected]>
Closes #12030