Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More bitfield brokenness #743

Open
emilio opened this issue Jun 10, 2017 · 7 comments
Open

More bitfield brokenness #743

emilio opened this issue Jun 10, 2017 · 7 comments

Comments

@emilio
Copy link
Contributor

emilio commented Jun 10, 2017

While investigating #739, I found out that the following:

#define POINTER_WIDTH (sizeof(void*) * 8)

struct Foo {
  int dummy;
  unsigned long foo: 1;
  unsigned long bar: POINTER_WIDTH;
};

Generates what I think it's correct code:

#[repr(C)]
#[derive(Debug, Copy)]
pub struct Foo {
    pub dummy: ::std::os::raw::c_int,
    pub _bitfield_1: [u64; 2usize],
    pub __bindgen_align: [u64; 0usize],
}

But apparently isn't, and Clang manages to stick the first bitfield after the int (wat).

The following is also incorrect, generating the right layout but the wrong offset:

struct Foobie {
  int dummy;
  unsigned long foo: 1;
  unsigned long bar: POINTER_WIDTH - 1;
};

We generate an int, then one word when we stick the two bitfields, but clang does the first bit at offset 32 (right after the int), and the other 63 bits after the gap, in the next u64.

Sigh.

@jcranmer
Copy link

jcranmer commented Aug 8, 2017

But apparently isn't, and Clang manages to stick the first bitfield after the int (wat).

The System V ABI for x86-64 says:

bit-fields may share a storage unit with other struct / union members

Which I interpret to mean that bitfields can reuse padding elements. In other words, for bitfield considerations, the struct looks like:

struct Foo {
  unsigned long dummy : 32;
  unsigned long foo : 1; /* Yay, I fit in the upper 32 bits of dummy's 64-bit qword! */
  unsigned long bar : POINTER_WIDTH;
};

I suppose there's a reason why bitfields tend to be buggy in complex ABI situations.

P.S. The MSVC ABI does something very different for bitfield layout.

@emilio
Copy link
Contributor Author

emilio commented Aug 8, 2017

Yeah, I'm aware of the ms_struct, and how it affects layout.

Bitfields are buggy because of how we represent structs. Doing the math to see at which offset the bitfield is is easy (and clang gives us a lot of info), but getting the next field to align properly is somewhat hard, given rust doesn't give a lot of control about it.

We could of course treat structs as a bucket of bytes and do reads at the offsets clang gives us... But that'd be barely usable for normal structs.

We could also use repr(C, packed) all the time and just insert bytes manually where we see fit, but there's people that use pre-generated bindgen files, and that'd suck for them...

@fitzgen
Copy link
Member

fitzgen commented Aug 8, 2017

We could of course treat structs as a bucket of bytes and do reads at the offsets clang gives us... But that'd be barely usable for normal structs.

I don't think this would actually be that terrible if we generated getters and setters.

It would require a breaking version bump, of course.

The other option is to move padding insertion into its own phase before codegen and represent it explicitly in the IR. This should help us see these "oh look at all that room for bitfields" situations better, as well as those bugs where we're conservative about deriving because there could be lots of padding we don't know about yet.

@gfxstrand
Copy link

I've observed this when trying to generate bindings for Mesa's NIR compiler. In particular, for nir_alu_dest which looks like this:

typedef struct {
   nir_dest dest;
   bool saturate;
   unsigned write_mask : 16;
} nir_alu_dest;

Bindgen attempts to align write_mask up to the next multiple of 4B after the boolean:

pub struct nir_alu_dest {
    pub dest: nir_dest,
    pub saturate: bool,
    pub _bitfield_align_1: [u16; 0],
    pub _bitfield_1: __BindgenBitfieldUnit<[u8; 2usize]>,
    pub __bindgen_padding_0: [u8; 5usize],
}

However, this is wrong because C puts it immediately after the boolean. I also have no idea what's up with the 5B padding at the end. That one makes no sense to me at all.

@emilio
Copy link
Contributor Author

emilio commented Dec 7, 2022

@gfxstrand do the layout tests catch this issue? What is nir_dest? The logic for padding lives in bindgen/codegen/struct_layout.rs, and I believe we're hitting the case of "we have more padding than the type is aligned to", but that seems like a consequence of the wrong bitfield layout to begin with.

That said, can mesa use a uint16_t write_mask; there? That'd trivially fix it... But if you can provide a reduced test-case I'm happy to look into the extra padding.

@gfxstrand
Copy link

Sure, I can try to reduce the example a bit. As for uint16_t, yes, that fixes it and I already merged that change so I'm off and running again. 😁 It just makes me a bit nervous given how heavily we use bitfields in Mesa.

@gfxstrand
Copy link

Here's a reduced example:

struct S {
    void *p;
    bool b;
    unsigned u : 16;
};

It produces

#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct S {
    pub p: *mut ::std::os::raw::c_void,
    pub b: bool,
    pub _bitfield_align_1: [u16; 0],
    pub _bitfield_1: __BindgenBitfieldUnit<[u8; 2usize]>,
    pub __bindgen_padding_0: [u8; 5usize],
}

which is pretty much the same, down to the weird 5B padding at the end.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants