Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support deriving traits on data-carrying enums #879

Merged
merged 1 commit into from
Sep 10, 2024

Conversation

djkoloski
Copy link
Member

@djkoloski djkoloski commented Feb 14, 2024

This PR is now ready for review.

This PR isn't yet complete. Still to do:

  • Write a padding check for data-carrying enums
  • Add support for deriving IntoBytes using the padding check
  • Add support for deriving FromZeros with a custom type check
  • Un-gate support for deriving Unaligned and FromBytes
  • Write tests for data-carrying enums
  • Justify new unsafe blocks in derived impls

@djkoloski djkoloski force-pushed the data_carrying_enum_support branch 3 times, most recently from 7dde5f6 to a9842ca Compare February 15, 2024 22:47
@djkoloski djkoloski changed the title Support deriving TryFromBytes on data-carrying enums Support deriving traits on data-carrying enums Feb 15, 2024
@codecov-commenter
Copy link

codecov-commenter commented Jul 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.00%. Comparing base (fb9d62b) to head (19c8fc9).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #879      +/-   ##
==========================================
+ Coverage   87.91%   88.00%   +0.09%     
==========================================
  Files          16       16              
  Lines        5757     5801      +44     
==========================================
+ Hits         5061     5105      +44     
  Misses        696      696              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@djkoloski
Copy link
Member Author

Blocked on #1515 so we can limit explicit enum discriminants to non-MSRV.

@djkoloski djkoloski force-pushed the data_carrying_enum_support branch 2 times, most recently from 127a90a to d6f8386 Compare July 16, 2024 17:45
@djkoloski djkoloski marked this pull request as ready for review July 16, 2024 17:46
@djkoloski djkoloski requested a review from joshlf July 16, 2024 18:21
@djkoloski djkoloski force-pushed the data_carrying_enum_support branch 4 times, most recently from 36912cc to bd13cf5 Compare July 22, 2024 16:01
@djkoloski djkoloski requested a review from jswrenn July 29, 2024 16:16
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
@djkoloski djkoloski force-pushed the data_carrying_enum_support branch 2 times, most recently from 034b168 to 811f96f Compare August 2, 2024 19:35
@djkoloski djkoloski requested a review from joshlf August 2, 2024 21:04
Copy link
Member

@joshlf joshlf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just reviewing zerocopy-derive/src/enums.rs for now. I'll follow up with reviews of other files.

Could you take a quick look at #367 and see if it'd be easy to add to this PR? If it is, it'd be great to have at least one example output that we can assert against since there are so many moving pieces here.

zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
@jswrenn
Copy link
Collaborator

jswrenn commented Aug 7, 2024

I'm seeing compilation errors on this code:

use zerocopy::*;
use zerocopy_derive::*;

#[derive(KnownLayout, TryFromBytes, Immutable)]
#[repr(u8)]
enum MyEnum {
    UnitLike,
    StructLike {
        a: u8,
        b: u16,
    },
    TupleLike(
        bool,
        char,
    ),
}

fn main() {}

The error is:

error[E0382]: use of moved value: `this`
    --> examples/dst.rs:4:23
     |
4    | #[derive(KnownLayout, TryFromBytes, Immutable)]
     |                       ^^^^^^^^^^^^
     |                       |
     |                       `this` moved due to this method call
     |                       value used here after move
     |                       move occurs because `this` has type `Ptr<'_, RawMyEnumVariantTupleLike, (A, invariant::Any, Initialized)>`, which does not implement the `Copy` trait

src/macro_util.rs Outdated Show resolved Hide resolved
src/macro_util.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
@jswrenn
Copy link
Collaborator

jswrenn commented Aug 8, 2024

This PR is hefty, in part, because it generates ad-hoc is_bit_valid functions, as opposed to delegating to our existing derives. If we can increase the amount of delegation, we can decrease both the PR size and the degree to which we need to reason about unsafe code.

Concretely, I believe we could expand this:

#[derive(KnownLayout, Immutable)]
#[repr(u8)]
enum MyEnum<X, Y> {
    UnitLikeVariant,
    StructLikeVariant {
        a: u8,
        b: X,
    },
    TupleLikeVariant(
        bool,
        Y,
    ),
}

...into this:

unsafe impl<X, Y> ::zerocopy::TryFromBytes for MyEnum<X, Y>
where
    u8: ::zerocopy::TryFromBytes,
    X: ::zerocopy::TryFromBytes,
    bool: ::zerocopy::TryFromBytes,
    Y: ::zerocopy::TryFromBytes,
{
    fn only_derive_is_allowed_to_implement_this_trait() {}

    fn is_bit_valid<A>(
        mut candidate: ::zerocopy::Maybe<'_, Self, A>,
    ) -> ::zerocopy::macro_util::core_reexport::primitive::bool
    where
        A: ::zerocopy::pointer::invariant::Aliasing
            + ::zerocopy::pointer::invariant::AtLeast<
                ::zerocopy::pointer::invariant::Shared,
            >,
    {
        mod tags {
            #[repr(u8)]
            #[allow(dead_code)]
            enum Variants {
                UnitLikeVariant /* = an explicit discr, if provided */,
                StructLikeVariant,
                TupleLikeVariant,
            }

            type Tag = ::zerocopy::macro_util::core_reexport::primitive::u8;

            /// Has the same bit-validity as the tag of
            /// `Self::UnitLikeVariant`.
            #[derive(::zerocopy_derive::TryFromBytes)] // delegate!
            #[repr(u8)]
            pub(super) enum UnitLikeVariant {
                Tag = Variants::UnitLikeVariant as Tag,
            }

            /// Has the same bit-validity as the tag of
            /// `Self::StructLikeVariant`.
            #[derive(::zerocopy_derive::TryFromBytes)] // delegate!
            #[repr(u8)]
            pub(super) enum StructLikeVariant {
                Tag = Variants::StructLikeVariant as Tag,
            }

            /// Has the same bit-validity as the tag of
            /// `Self::TupleLikeVariant`.
            #[derive(::zerocopy_derive::TryFromBytes)] // delegate!
            #[repr(u8)]
            pub(super) enum TupleLikeVariant {
                Tag = Variants::TupleLikeVariant as Tag,
            }
        }

        use ::zerocopy::macro_util::{Variant, core_reexport::marker::PhantomData};

        /// Has the same layout (including bit validity) as
        /// `Self::UnitLikeVariant`.
        #[derive(::zerocopy_derive::TryFromBytes)] // delegate!
        #[repr(C)]
        struct UnitLikeVariant<X,Y>(
            tags::UnitLikeVariant,
            PhantomData<(X, Y)>
        );

        // SAFETY: The layout (including bit validity) of
        // `UnitLikeVariant<X,Y>` is identical to that of
        // `MyEnum<X, Y>::UnitLikeVariant`.
        unsafe impl<X, Y> Variant<MyEnum<X, Y>> for UnitLikeVariant<X,Y>
        {}

        /// Has the same layout (including bit validity) as
        /// `Self::StructLikeVariant`.
        #[derive(::zerocopy_derive::TryFromBytes)] // delegate!
        #[repr(C)]
        struct StructLikeVariant<X,Y>(
            tags::StructLikeVariant,
            u8,
            X,
            PhantomData<(X, Y)>
        );

        // SAFETY: The layout (including bit validity) of
        // `StructLikeVariant<X,Y>` is identical to that of
        // `MyEnum<X, Y>::StructLikeVariant`.
        unsafe impl<X, Y> Variant<MyEnum<X, Y>> for StructLikeVariant<X,Y>
        {}

        /// Has the same layout (including bit validity) as
        /// `Self::TupleLikeVariant`.
        #[derive(::zerocopy_derive::TryFromBytes)] // delegate!
        #[repr(C)]
        struct TupleLikeVariant<X, Y>(
            tags::TupleLikeVariant,
            bool,
            Y,
            PhantomData<(X, Y)>
        );

        // SAFETY: The layout (including bit validity) of
        // `TupleLikeVariant<X,Y>` is identical to that of
        // `MyEnum<X, Y>::TupleLikeVariant`.
        unsafe impl<X, Y> Variant<MyEnum<X, Y>> for TupleLikeVariant<X,Y>
        {}

        false
            || <UnitLikeVariant<X, Y> as Variant<Self>>::is_bit_valid(candidate.reborrow())
            || <StructLikeVariant<X, Y> as Variant<Self>>::is_bit_valid(candidate.reborrow())
            || <TupleLikeVariant<X, Y> as Variant<Self>>::is_bit_valid(candidate.reborrow())
    }
}

...with this additional helper trait added to macro_util:

/// Marks that `Self` has layout (and bit validity) identical to that
/// of a variant of the enum `E`.
/// 
/// # Safety
/// 
/// `Self`'s layout (including bit validity) must be identical to that
/// of a variant of the enum `E`.
pub unsafe trait Variant<E>: Sized {
    fn cast<I>(e: Ptr<'_, E, I>) -> Ptr<'_, Self, (I::Aliasing, invariant::Any, invariant::Any)>
    where
        I: Invariants,
    {
        // SAFETY: By precondition on this trait, `Self` has layout
        // identical to that of a variant of the enum `E`, thus the
        // size of `Self` is identical to that of `E`.
        unsafe { e.cast_unsized(|p| p as *mut Self) }
    }

    fn is_bit_valid<A>(
        candidate: crate::Maybe<'_, E, A>,
    ) -> bool
    where
        A: invariant::Aliasing
            + AtLeast<
                invariant::Shared,
            >,
        Self: TryFromBytes,
    {
        let candidate = Self::cast(candidate);
        // SAFETY: The above cast does not de-initialize the
        // candidate's bytes, thus the `Ptr`'s referent remains
        // initialized.
        let candidate = unsafe { candidate.assume_initialized() };
        TryFromBytes::is_bit_valid(candidate)
    }
}

@djkoloski
Copy link
Member Author

I agree that reusing our existing derives as much as possible is good, but there are costs paid in the efficiency of the code we generate. Because the enum already checks the validity of the tag, we should skip re-checking it for each of the variant structs that we downcast to. Otherwise we're re-checking data that we already know is valid.

@jswrenn
Copy link
Collaborator

jswrenn commented Aug 8, 2024

I don't follow. In my example, there's no 're-checking data we already know to be valid'. The discriminant check happens only after (and because of) the cast.

@djkoloski
Copy link
Member Author

Sorry, I misunderstood your code slightly. In the code you provided, we try checking each of the variants of the enum one-by-one. This means that we do multiple checks on the tag of the enum, each time checking if it matches a single variant's discriminant. This means that 1. we're generating one branch per enum variant and 2. if we match an enum variant and still fail is_bit_valid then we continue checking other variants.

This PR currently reads the tag out once and does a single match on it to delegate to an appropriate is_bit_valid function. This guarantees that the compiler can optimize it to a jump table or branch table. As a result, we need the delegated is_bit_valid functions to avoid re-checking the tag - we wouldn't be calling is_bit_valid on a specific variant if we didn't know that it was the only possible variant that could match.

jswrenn
jswrenn previously requested changes Aug 9, 2024
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
@jswrenn
Copy link
Collaborator

jswrenn commented Aug 9, 2024

Something I've been doing as I review this is replacing the // SAFETY comments in the derive with " SAFETY: ..."; exprs. These get preserved in the expanded output, so I can evaluate the safety comments in situ.

@joshlf
Copy link
Member

joshlf commented Aug 9, 2024

Something I've been doing as I review this is replacing the // SAFETY comments in the derive with " SAFETY: ..."; exprs. These get preserved in the expanded output, so I can evaluate the safety comments in situ.

Another option would be:

  • Expose safety_comment! as a #[doc(hidden)] #[macro_export] (and maybe move it to live in macro_util)
  • In safety_comment!...
    • Make it optional to actually provide a macro to call (ie, it's acceptable to only provide a doc comment and nothing else)
    • In the macro expansion, emit that doc comment on a const _: () = (); or similar so that it has an item to live on
  • Inside of quote!(...), do:
    safety_comment! {
        /// SAFETY:
        /// Blah blah blah
    }

@jswrenn
Copy link
Collaborator

jswrenn commented Aug 9, 2024

@joshlf I initially did roughly that, but the extra const _: () = ();s are pretty distracting in the expanded output.

Copy link
Member

@joshlf joshlf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed most of this PR, but still need to review the following files:
Still need to review:

  • zerocopy-derive/src/ext.rs
  • zerocopy-derive/src/repr.rs
  • zerocopy-derive/src/lib.rs
  • zerocopy-derive/tests/*

It may be later today or tomorrow before I get a chance to review those files, so I figured it'd be better for you to at least have these comments now.

src/impls.rs Outdated Show resolved Hide resolved
src/macro_util.rs Outdated Show resolved Hide resolved
src/macro_util.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/lib.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/lib.rs Show resolved Hide resolved
zerocopy-derive/src/lib.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/lib.rs Show resolved Hide resolved
joshlf
joshlf previously requested changes Aug 14, 2024
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/enums.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/lib.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/lib.rs Outdated Show resolved Hide resolved
zerocopy-derive/src/repr.rs Outdated Show resolved Hide resolved
zerocopy-derive/tests/enum_from_zeros.rs Outdated Show resolved Hide resolved
@djkoloski djkoloski force-pushed the data_carrying_enum_support branch 4 times, most recently from 88c16dc to f9f2483 Compare September 10, 2024 17:16
Copy link
Member Author

@djkoloski djkoloski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed feedback

@joshlf joshlf dismissed stale reviews from jswrenn and themself September 10, 2024 17:33

Sync'd in person

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: Confirm that UnsafeCell issues are sound

Comment on lines +80 to 86
struct NotTryFromBytes;

#[derive(TryFromBytes)]
#[repr(u8)]
enum TryFromBytes3 {
A(NotTryFromBytes),
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this isn't generating any error, presumably because the entire compilation is failing during an earlier compiler pass. You will likely need to put this in a separate file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines +127 to +131
#[derive(FromZeros)]
#[repr(u8)]
enum FromZeros6 {
A(NotFromZeros),
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above - needs to be in a separate file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

#[repr(u8)]
enum FromZeros7 {
A = 1,
B(NotFromZeros),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this a type that does implement FromZeros to make it clear that we're testing the fact that there's no 0 discriminant?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -160,6 +183,267 @@ enum FromBytes7 {
A,
}

#[derive(FromBytes)]
#[repr(u8)]
enum FooU8 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines +533 to +538
#[derive(IntoBytes)]
#[repr(u8)]
enum IntoBytes1 {
A,
B(u8),
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same above

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines +544 to +548
#[derive(IntoBytes)]
#[repr(u8)]
enum IntoBytes2 {
A(Align4IntoBytes),
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines +550 to +555
#[derive(IntoBytes)]
#[repr(u32)]
enum IntoBytes3 {
A(u32),
B(u16),
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member

@joshlf joshlf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're going to follow up with #1634, which will block releasing 0.8.

@joshlf joshlf added this pull request to the merge queue Sep 10, 2024
Merged via the queue into main with commit 87f1925 Sep 10, 2024
147 checks passed
@joshlf joshlf deleted the data_carrying_enum_support branch September 10, 2024 19:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants