Add bitfields support #3113

Open
wants to merge 4 commits into
base: master

@Andy-Python-Programmer Andy-Python-Programmer commented Apr 22, 2021

This RFC adds support for bit-fields in repr(C) structs by

  • Introducing a new attribute bits(N) that can be applied to integer
    fields.
  • Allowing such annotated fields to be unnamed.

Example:

use core::mem;

#[repr(C)]
struct HbaPrdtEntry {
    data_base_upper: u32,
    data_base_address_upper: u32,
    reserved: u32,

    #[bits(22)] byte_count: u32,
    #[bits(9)] reserved_2: u32,
    #[bits(1)] interrupt_on_completion: u32,
}

assert_eq!(mem::size_of::<HbaPrdtEntry>(), 16);
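For comparison, here is a minimal sketch of how the last 32-bit word of this struct might be emulated in today's Rust with shifts and masks. The type name `HbaPrdtEntryManual`, the private `flags` word, and the accessor names are hypothetical and not part of the RFC, and the bit positions assume that fields are allocated starting from the least significant bit, which is exactly the kind of target-dependent detail the RFC wants the compiler to handle.

```rust
use core::mem;

/// Hand-written stand-in for the RFC's `HbaPrdtEntry` (hypothetical name).
/// The three bit-fields share one `u32`; the positions below ASSUME fields
/// are allocated starting from the least significant bit.
#[repr(C)]
#[derive(Clone, Copy, Default)]
pub struct HbaPrdtEntryManual {
    pub data_base_upper: u32,
    pub data_base_address_upper: u32,
    pub reserved: u32,
    /// bits 0..=21 byte_count, bits 22..=30 reserved_2,
    /// bit 31 interrupt_on_completion
    flags: u32,
}

const BYTE_COUNT_MASK: u32 = (1 << 22) - 1;
const IOC_SHIFT: u32 = 31;

// Same size as the struct in the RFC example.
const _: () = assert!(mem::size_of::<HbaPrdtEntryManual>() == 16);

impl HbaPrdtEntryManual {
    pub fn byte_count(&self) -> u32 {
        self.flags & BYTE_COUNT_MASK
    }

    pub fn set_byte_count(&mut self, value: u32) {
        // Keep the other fields, replace only the low 22 bits.
        self.flags = (self.flags & !BYTE_COUNT_MASK) | (value & BYTE_COUNT_MASK);
    }

    pub fn interrupt_on_completion(&self) -> bool {
        (self.flags >> IOC_SHIFT) & 1 == 1
    }

    pub fn set_interrupt_on_completion(&mut self, on: bool) {
        self.flags = (self.flags & !(1 << IOC_SHIFT)) | ((on as u32) << IOC_SHIFT);
    }
}
```

With this sketch, `entry.set_byte_count(5)` followed by `entry.byte_count()` yields `5` and leaves the interrupt bit untouched.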

This pull request reopens #3064, as mahkoh is no longer participating in the Rust community.

Issue: #314

Rendered preview

@leonardo-m

Let's also take a look at other designs from other languages, like Ada, etc.

@programmerjake
Member

I think we should instead do something like the following:

#[repr(C)]
struct MyStruct {
    #[repr(bitfield(u32))]
    a: uint<5>,
    #[repr(bitfield(u32))]
    b: uint<3>,
    #[repr(bitfield(new, u32))]
    c: uint<12>,
    #[repr(bitfield(u16, new))]
    d: u16,
    e: bool,
    #[repr(bitfield(bool))]
    f: bool,
    #[repr(bitfield(bool))]
    g: bool,
}

which is equivalent to the C struct:

struct MyStruct {
    uint32_t a:5;
    uint32_t b:3;
    uint32_t :0, c:12;
    uint16_t d:16, :0;
    _Bool e, f:1, g:1;
};

This representation gives the advantage that Rust fields have the actual type (e.g. uint<3> instead of weird u32) that can be stored in the struct, instead of copying the C mis-step of having a fake type that you can't actually use all the normal values of (e.g. a 5-bit int32_t bitfield can only store -16 through 15, instead of the full expected range of 32-bit values).

I'd expect Rust to get generic integers (uint<N>/int<N>) (for at least <= 128 bits) because bitfields are far from the only place where arbitrary generic integers are quite useful, they are also needed for representing bit-masks for SIMD.
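The range claim above can be checked with a small sketch. The helper names `signed_range` and `sign_extend` are hypothetical, and the code assumes two's-complement fields of 1 to 32 bits.

```rust
/// Inclusive range of values an `bits`-wide two's-complement field can hold.
/// Valid for 1 <= bits <= 32.
pub fn signed_range(bits: u32) -> (i64, i64) {
    assert!((1..=32).contains(&bits));
    let half = 1i64 << (bits - 1);
    (-half, half - 1)
}

/// Sign-extends the low `bits` bits of `raw` into an `i32`, which is what
/// reading a signed bit-field amounts to. Valid for 1 <= bits <= 32.
pub fn sign_extend(raw: u32, bits: u32) -> i32 {
    assert!((1..=32).contains(&bits));
    let shift = 32 - bits;
    // Move the field's top bit into bit 31, then arithmetic-shift back down.
    ((raw << shift) as i32) >> shift
}
```

For a 5-bit field, `signed_range(5)` gives `(-16, 15)`, and the stored pattern `0b10000` reads back as `-16`.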

@Lokathor
Contributor

"needed" is perhaps an overstatement of the SIMD situation.

As to the RFC:

  • "bit-field fields" is all kinds of poor; I want that bike shed to be some other color.
  • why are we allowing bool fields of more than one bit? We should just disallow that and block off a potential source of UB.
  • since we're adding a new struct style we should just make it be the case that if the struct declaration can't be converted to C it's a compile error.
  • I think many of the future possibility bullets should be resolved in the RFC because their answer affects if we should take any of the alternative paths.

@cjgillot

Bitfields are interesting on their own, even without honoring the C layout. They can be used to pack information more densely: for instance, the bitflags crate replacing a struct of bools. At the moment, there is no robust way to pack small enums inside the same byte.
I would suggest:

  • allowing C-like enums in bitfields when the enum's range fits the allocated bits, with no bounds checks ;
  • allowing non-repr(C) bitfields, for Rust-optimized layouts ;
  • allowing bitfields in tuple structs ;
  • allowing bitfields in an enum variant's fields.

@Lokathor
Contributor

If you don't need to honor the C layout rules you can do the same effect as bitfields very simply with just a macro_rules or two, and you get a lot more control over it than any set of language rules that would have to fit all situations for all people across the entire language.

So I honestly don't think we need non-repr-C bitfields.

@Andy-Python-Programmer
Author

Andy-Python-Programmer commented Apr 26, 2021

allowing C-like enums in bitfields when the enum's range fits the allocated bits, with no bounds checks

I would say this is more likely for a non-safe language to say. If we actually add support for this then we will need to add syntax like unsafe enum/struct.

allowing bitfields in tuple structs

What about transparent tuple structs? We could restrict transparent structs to non-bitfield structs to make this feature happen.

@Lokathor
Contributor

I would say this is more likely for a non-safe language to say. If we actually add support for this then we will need to add syntax like unsafe enum/struct.

This is a fairly simple static verification step. The compiler can trivially determine if a given enum has all possible bit patterns of a given bit mask inhabited, and then either allow without bounds check or compile error.
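A rough sketch of that verification, written as ordinary const code. The enum `Mode`, the function `fully_inhabited`, and the hand-written discriminant list are hypothetical stand-ins for what the compiler would derive on its own.

```rust
/// A C-like enum whose four variants cover every 2-bit pattern.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum Mode {
    Off = 0,
    Low = 1,
    High = 2,
    Auto = 3,
}

impl Mode {
    pub const BITS: u32 = 2;
    const ALL: [Mode; 4] = [Mode::Off, Mode::Low, Mode::High, Mode::Auto];

    /// Total conversion: every masked pattern names a variant, so no
    /// fallible bounds check and no `unsafe` are needed.
    pub const fn from_bits(raw: u8) -> Mode {
        Self::ALL[(raw & 0b11) as usize]
    }
}

/// True if `discriminants` contains every value in `0..2^bits` (bits < 32).
pub const fn fully_inhabited(discriminants: &[u32], bits: u32) -> bool {
    let mut pattern = 0;
    while pattern < (1u32 << bits) {
        let mut found = false;
        let mut i = 0;
        while i < discriminants.len() {
            if discriminants[i] == pattern {
                found = true;
            }
            i += 1;
        }
        if !found {
            return false;
        }
        pattern += 1;
    }
    true
}

// Compile-time check standing in for what the compiler could verify itself.
const _: () = assert!(fully_inhabited(&[0, 1, 2, 3], Mode::BITS));
```

If a variant were removed, the const assertion would fail to compile, which corresponds to the "compile error" branch described above.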

allowing bitfields in tuple structs

Why would this be treated any differently at all from curly brace structs?

@mehcode

mehcode commented Apr 26, 2021

For some prior art, I've always loved how C# does this:

[StructLayout(LayoutKind.Explicit, Size = 4)]
struct Foo {
  [FieldOffset(0)]
  public byte bar;

  [FieldOffset(0)]
  public int baz;
}

To guess how we might adapt that to Rust:

#[repr(explicit(size = 4))]
struct Foo {
  #[repr(offset = 0)]
  bar: u8,
 
  #[repr(offset = 0)]
  baz: u32,
}

#[repr(explicit(size = 1))]
struct Flags {
  #[repr(offset = 0, size = 1)]
  one: bool,

  #[repr(offset = 1, size = 1)]
  two: bool,

  #[repr(offset = 5, size = 1)]
  three: bool,
}

To summarize,

Add offset and size to the repr attribute and allow using this on fields when the struct is annotated with repr(explicit).
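For the byte-level overlap in the first `Foo` example, the closest thing available in Rust today is probably a `repr(C)` union. The sketch below assumes that is an acceptable analog; the helper `first_byte` is hypothetical. It does not cover the bit-granular offsets of the `Flags` example.

```rust
use core::mem;

/// Rough present-day analog of the overlapping `Foo`: both fields start at
/// offset 0, so the whole value is 4 bytes.
#[repr(C)]
#[derive(Clone, Copy)]
pub union Foo {
    pub bar: u8,
    pub baz: u32,
}

const _: () = assert!(mem::size_of::<Foo>() == 4);

/// Reads the byte at offset 0 through `bar`, whichever field was written.
pub fn first_byte(foo: Foo) -> u8 {
    // SAFETY: both fields initialise the byte at offset 0, and every bit
    // pattern is a valid `u8`.
    unsafe { foo.bar }
}
```

Which byte of `baz` shows up in `bar` depends on the target's endianness, so `first_byte(Foo { baz: x })` equals `x.to_ne_bytes()[0]`.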


I do think bit-n integers would fit nicely here but I think that's orthogonal and shouldn't be part of this RFC.

When a field annotated with `bits(N)` is read, the value has the type
of the field and the behavior is as follows:

- The `N` bits of storage occupied by the bit-field are read.
Contributor

I don't think this is correct as written. If N is 7, then we must read at least 8 bits.

I think it would be better to speak about what won't be read when reading from a bit field.

- If overflow checks are enabled and the value is outside the range of values
that can be read from the field, the overflow check fails.
- The bitmask `(1 << N) - 1` is applied to the value and the remaining `N`
significant bits are written to the storage of the bit-field.
Contributor

This might also include a read. That should be mentioned.
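To make the point concrete, a write to a bit-field that shares its storage with neighbours is a read-modify-write of the whole storage unit. The sketch below assumes a 32-bit storage unit and LSB-first bit offsets; the function names are hypothetical, and the returned `bool` only models the overflow check rather than panicking.

```rust
/// Mask with the low `n` bits set (n may be 0..=32).
fn low_mask(n: u32) -> u32 {
    if n >= 32 { u32::MAX } else { (1u32 << n) - 1 }
}

/// Reads the `n`-bit field at bit `offset`: the whole unit is loaded even
/// though only `n` bits are wanted. Assumes offset + n <= 32.
pub fn read_field(storage: u32, offset: u32, n: u32) -> u32 {
    (storage >> offset) & low_mask(n)
}

/// Writes `value` into the `n`-bit field at bit `offset`, truncating it to
/// `n` bits. Returns `true` if no bits were lost, i.e. an overflow check
/// would have passed. Assumes offset + n <= 32.
pub fn write_field(storage: &mut u32, offset: u32, n: u32, value: u32) -> bool {
    let mask = low_mask(n);
    let old = *storage; // read: the neighbouring fields must be preserved
    let cleared = old & !(mask << offset); // modify: clear this field's bits
    *storage = cleared | ((value & mask) << offset); // write the merged unit
    value <= mask
}
```

Because of the initial load, such a write is not atomic with respect to the neighbouring fields, which matters for concurrent or memory-mapped access.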

@scottmcm scottmcm added the T-lang Relevant to the language team, which will review and decide on the RFC. label Apr 29, 2021
@anp
Member

anp commented Apr 29, 2021

Should the prior art section also cover the bitfield crate? It is quite nice in my experience. I think it would probably also be good for the RFC to explain why it needs to be in the language vs in a crate.

@Lokathor
Contributor

I think the bitfield crate, and the fact that you can even make your own alternative if you don't like how it handles things, is proof enough that we don't need language support for repr(rust) bitfields.

However, lang support for repr(C) bitfields brings in a high level of confidence that the compiler will correctly match the layout of the local C ABI when compiling for a target. So to me that's the valuable thing to focus on here.

@ds84182

ds84182 commented May 11, 2021

@Lokathor

Macro-generated field emulation (via explicit getter/setter methods) is clunky in comparison to an actual field. Nothing in Stable and Nightly will fix this: there is no DerefMove or DerefSet to simulate properties; there is no replacement for struct construction syntax and patterns. Property-like const fns would be a great addition to the language, for sure.

For layout concerns repr(C) (matching local C ABI) may be desired for FFI, but when interacting with hardware, file formats, etc. a repr(Stable) as in "what you write is what you get", is more valuable. But the latter is not covered by this RFC.


C bitfields are a nightmare. They're tacky and platform dependent. This RFC shouldn't spend time making them ultra-ergonomic to write. Instead, I think this RFC should call out that:

  • This requires compiler support due to how the ABI varies across compilers/calling conventions and architecture.
  • The final size, offset, etc. of the bitfields are an implementation detail matching the local C ABI. Transmuting a bitfield to an integer type is generally undefined behavior, except if properly done on a specific architecture. Some ABIs may add padding at the bit-level, the location of the padding is part of the ABI.
  • Syntactically, the design of C ABI bitfields will not match the theoretical "ergonomic, layout stable" bitfield. Nor will it impact their design. These ideas should be kept separate.
  • Syntax should not fall far from C, the intention is C interop. More verbose syntax prevents hand translation of C bitfields.

@Andy-Python-Programmer
Author

C bitfields are a nightmare

Well, that's normal, as it's C xD. Here's a list of issues the Linux kernel experienced from bitfields betrayed by GCC: https://lwn.net/Articles/478657/. We do not want that to happen in the Rust world :D that's why this ultra-ergonomic write is useful.

@workingjubilee
Member

Macro-generated field emulation (via explicit getter/setter methods) is clunky in comparison to an actual field. Nothing in Stable and Nightly will fix this: there is no DerefMove or DerefSet to simulate properties; there is no replacement for struct construction syntax and patterns. Property-like const fns would be a great addition to the language, for sure.

Why is this need best solved by this instead of DerefMove or DerefSet, then?

Comment on lines +19 to +23
The Linux kernel user-space API contains over 400 bit-fields. Writing the
corresponding types in Rust poses significant problems because the layout of
structs that contain bit-fields varies between architectures and is hard to
calculate manually. Consider the following examples which were compiled with
Clang for the `X-unknown-linux-gnu` target:
Member

This brings up a question I have been meaning to ask: aren't C bitfields highly compiler-defined? What parts of bitfield allocation are defined by the implementation of the compiler, and why can we trust the bitfields designed for C interop actually match the implementation for that compiler? I do not believe this is a trivial implementation question we can shrug off for later.

Off the top of my head I have seen compilers change the following between implementations (at least where important for interoperability). Some implementations even allow these to be configured either through compiler flags or #pragmas.

  • default bit order (MSB vs LSB)
  • what integer types are allowed (by default only int and unsigned int are allowed)
  • bit packing (how the fields as defined actually get squished into the bytes)
  • if fields can straddle storage-unit boundaries
    • e.g. with the following bitfield, will the size of the structure be two bytes or three? Assuming LSB allocation order, will b include bits 6..=9 or 8..=11?
    struct example_t {
        unsigned char a: 6;
        unsigned char b: 4;
        unsigned char c: 6;
    };
    

Based on all of this I think if Rust is going to support bitfields natively, we need to select one set of features and stick to it.
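The two possible answers to the `example_t` question can be worked through with a toy layout function. This is a deliberately simplified model with LSB-first allocation, one unit size, and no alignment rules; it is not the algorithm of any real ABI, and the name `layout` is hypothetical.

```rust
/// Bit offset of each field plus the total size in bytes, for fields of the
/// given widths packed LSB-first into `unit_bits`-wide storage units.
/// `may_straddle` selects whether a field may cross a unit boundary.
/// Assumes every width is <= `unit_bits` and `unit_bits` is a multiple of 8.
pub fn layout(widths: &[u32], unit_bits: u32, may_straddle: bool) -> (Vec<u32>, u32) {
    let mut offsets = Vec::with_capacity(widths.len());
    let mut next = 0u32; // next free bit
    for &w in widths {
        let room = unit_bits - next % unit_bits; // bits left in current unit
        if !may_straddle && w > room {
            next += room; // pad up to the start of the next unit
        }
        offsets.push(next);
        next += w;
    }
    let units = (next + unit_bits - 1) / unit_bits; // round up to whole units
    (offsets, units * unit_bits / 8)
}
```

For widths `[6, 4, 6]` in 8-bit units, allowing straddling gives offsets `[0, 6, 10]` and a size of 2 bytes (so `b` covers bits 6..=9), while forbidding it gives offsets `[0, 8, 16]` and a size of 3 bytes (so `b` covers bits 8..=11).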

Member

Well, that's... somewhat stressful to think about. It sounds like we may have trouble actually finding a rule to adhere to for the Rust compiler that produces a conformant implementation. I found the following in C11, 6.7.2.1:

  1. An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.
  2. A bit-field declaration with no declarator, but only a colon and a width, indicates an unnamed bit-field. As a special case, a bit-field structure member with a width of 0 indicates that no further bit-field is to be packed into the unit in which the previous bit-field, if any, was placed.

The System V AMD64 ABI adds:

  • bit-fields are allocated from right to left
  • bit-fields may share a storage unit with other struct / union member
  • Unnamed bit-fields’ types do not affect the alignment of a structure or union.

And I have taken a selection of things from the Arm64 AAPCS here (it includes a reasonably complete layout calculation algorithm, which I have not quoted in full):

  • For each bit-field, the type of its container is:
    • Its declared type if its size is no larger than the size of its declared type.
    • The largest integral type no larger than its size if its size is larger than the size of its declared type (see Over-sized bit-fields).
  • The container type contributes to the alignment of the containing aggregate in the same way a plain (not bit-field) member of that type would, without exception for zero-sized or anonymous bit-fields.
  • The content of each bit-field is contained by exactly one instance of its container type.
  • For big-endian data types K(F) is the offset from the most significant bit of the container to the most significant bit of the bit-field.
  • For little-endian data types K(F) is the offset from the least significant bit of the container to the least significant bit of the bit-field.
  • The AAPCS does not allow exported interfaces to contain packed structures or bit-fields.

The last one is kind of funny because it makes me wonder if that means we should compile error if someone tries to yield a struct with a bitfield to extern "C" on Arm?

Member

@workingjubilee workingjubilee Sep 9, 2021

Apparently the issue, as explained to me on Zulip, is actually that "packed structures" or "packed bit-fields" may not be used through "exported interfaces". Here an "exported interface" is anything that gets exposed to other programs via symbols, etc. in the usual manner and thus may have a call site that is not directly controlled by the compiler in question. Importantly, the bit-fields in question for ordinary #[repr(C)] data should be fine.

For those procedures exclusively governed by a single compiler, of course, the compiler can ignore the entire AAPCS anyways, since it is the boundaries between programs (or libraries, etc.) that are what the AAPCS is designed to govern.

@workingjubilee
Member

workingjubilee commented Feb 18, 2023

I have been evaluating the landscape of C compilers and have become much more familiar with the standard. The C23 standard is going to land without enormous improvements to the handling of bitfields per se. There will be some improvements to the ability to specify various sizes, which, if accepted, will give programmers more control in new declarations, but they will not change existing bitfields in the world of C code. That means that bitfields will often remain, essentially, implementation-defined.

So it is important for this RFC to reflect on what, exactly, it means to be "C-compatible", when "C" does not have one definition, nor even 5 (C89, C99, C11, C17, and C23), but one for every single C compiler and for every single target, combinatorically. Often, repr(C) is used to merely enforce stabilized field layout, but adding bitfields could imply wildly different layouts based on which compiler it was compatible with.

Even when we factor in such things as processor-specific ABIs, often the layout of bitfields is ambiguous at best. That creates a unique problem when this much is left open for implementation definition:

The implementer can change their mind.

And that could undermine the kind of stability guarantees that programmers expect from Rust. In the past, vendor actions have significantly impacted Rust's platform support. However, because they affected OS-specific interfaces, or whether a target existed, or something wrapped in abstraction barriers, this hasn't mattered much to the core of the Rust language.

But this RFC, if accepted, could make those changes cut much deeper. The details of how repr(C) works in the presence of a given struct field impacts Rust's language semantics directly regarding memory layout, in ways that can affect programs and programmers.

In a future where Standard C does specify, exactly, how bitfields should be handled, even just enough that we could believe that compilers would at least reliably come to similar conclusions, this RFC would seem useful. Until then, it seems more appropriate to address the C bitfields problem by providing tools that make it easier to solve this in libraries.

@steffahn
Member

repr(C) structs in Rust currently serve a dual role: interop with C on the one hand, and a more strictly-/ well-defined layout (for unsafe code to rely on) on the other hand.

Given that context, I think it’s important that features can be (and are) explained in a way that doesn’t require deep familiarity with other programming languages / with C.

I know some basics of C, but I don’t know anything about “bit-fields” in particular. I am deeply familiar with Rust. With this background, this RFC reads very weird to me: basically, I understand almost nothing from the RFC text alone as long as I haven’t read through the complete “reference-level explanation” in detail yet.

But IMO, it certainly shouldn’t be the case that someone deeply familiar with Rust will understand nothing at all from the motivation and guide-level explanations of a RFC alone.

Here’s the limited information that I personally got as an understanding / takeaway from those sections, so you know what you can improve upon.


From the motivation:

The C language has something called “bit-fields”. The Linux kernel uses them. They are hard to understand/calculate/whatever. They have a peculiar syntax using a colon that I’ve never seen before, and they have surprising/weird platform-dependent effects that I cannot even begin to understand without the slightest hint of where this is coming from.

So far this reads like a horribly confusing feature that I wouldn’t want to have in Rust at all if there’s any chance to avoid it and get the necessary interop in a different way. If you want this motivation to give off a different vibe than “wtf is this weirdness I don’t understand it and don’t want it”, then perhaps the motivation should not only motivate “bitfields exist and are hard” but also give some indication why they’re a useful (and thus used) feature of the C language, what kind of feature they are, and the most basic intuition of what a “bit-field” even is.

From the guide-level explanation:

The RFC proposes some attribute-based syntax that’s supposed to be an equivalent to the C syntax. As to what the syntax means, I shall better learn some C, I guess?

From skimming through the reference-level explanation:

There’s syntax of course, good, I can skip that, since the fact that there’s a new syntax is the only thing that I did understand in the RFC.

There’s some nomenclature and restrictions... alright, restrictions don’t give me much in terms of what a bit-field is in the first place.

Writing this reply as I’m reading more of that section... finally, this is the first time I come across the most crucial piece of information. This should be among the first sentences of the RFC, but instead it’s well hidden in the middle of the “reference-level” section.

Each field annotated with bits(N) occupies N bits of storage.

On that note, the reference-level explanation should probably get some structure that separates the different sections about syntax, restrictions of what types and values can be used, semantics of interoperating with the fields, layout, and possibly more.


Finally, a single note from me about the contents, not the presentation, and I suppose this has been mentioned in the discussion above already, too. I’m perplexed by the premise that

The language reference shall document for each target the layout of structs containing bit-fields.

The intended behavior is that the layout is the same layout as produced by the C system compiler of the target when compiling the corresponding C struct for the same target.

As I mentioned in the beginning of this post, the dual-role of repr(C) in Rust makes a layout that will be heavily dependent on ... well ... as many factors that I cannot even quote them off the top of my head after reading this RFC ... seem quite unfitting, and different from the rest of repr(C) layout. Reading this RFC, this kind of “bitfield” feature would be intended to be used exclusively for C-interop; so it’s comparable to?… well…, probably vararg functions, or extern "C" functions. The difference however is that it proposes a whole language feature, that does seem possibly useful on its own, whereas (as far as I’m aware) varargs are highly unsafe (and despite being supported by Rust FFI, you cannot actually use the feature within Rust very well) and extern "C" is really only an FFI thing.

The fact that the bitfields feature introduces a full language feature that's usable without unsafe and possibly quite useful on its own – outside of FFI considerations – means that making available only the “weirdly platform-dependent C-compatible” way of doing layout seems surprisingly restrictive to me.

@VegaDeftwing

VegaDeftwing commented Jan 29, 2024

The current bitfield situation is a potential blocker to using Rust in embedded applications at some organizations cough my employer cough as it makes writing safe, standard, error-free code difficult.

Nearly all C compilers (GCC, Clang, IAR, etc.) support marking bitfields as "Packed" via a #pragma. This tells the compiler to not use the otherwise arcane C rules for formatting bit fields and to not add platform dependent padding. This is used in a large amount of networking code because it means we can trade computation (packing isn't free - more instructions) for denser structures to send over a wire or over the air.

Obviously, we can write macros or functions and a pile of setters and getters and bit math headaches to accomplish this anyway - it's all just bytes at the end of the day. But saying we only need the repr(C) variant to allow for C's mildly cursed padding rules for compatibility isn't right either. So long as the original C is using one of the packing #pragmas, the obvious, not-just-for-compatibility easy to use layout that a Rust oriented new syntax would allow would make translating C code much easier too.

As a bonus, supporting only the not-platform-dependent use case would make it easier for something like bindgen to make something more idiomatic. See https://rust-lang.github.io/rust-bindgen/using-bitfields.html - if you actually run this you'll see it makes quite the mess. - EDIT: It's also already a platform-dependent mess, as bindgen needs to know the target to get the padding right: https://rust-lang.github.io/rust-bindgen/faq.html#how-to-generate-bindings-for-a-custom-target

As is, every option leaves a bit to be desired.

  • Not using any crates means harder to write, easier to make mistakes in bit-math code
  • Using any crate means adding a dependency
    • Deku which looks to me to be the nicest bitfield crate requires at least alloc - a potential non-starter for embedded.
    • Modular Bitfield, which appears to be the most popular option has a variety of problems that this blog post outlines nicely.

So while I'm in support of taking time to get this right, saying "We can leave it up to crates" doesn't seem good enough.

@programmerjake
Member

programmerjake commented Jan 29, 2024

  • Deku which looks to me to be the nicest bitfield crate requires at least alloc - a potential non-starter for embedded.

from looking at the cargo features and the lib.rs, it looks like it might work without alloc, just don't enable the alloc feature

(edit: nevermind, it documents alloc being required on no_std, imo it should have just always used alloc then and not had that feature gate)

@VegaDeftwing

I am distinctly not an experienced Rust dev, but if I had to throw my hat in the ring to recommend syntax, it would probably be something like this:

#[endian(little)]
struct Foo {
    a : bool,
    b : u7,
    #[endian(big)]
    d : u24,
    e : [i16; 4],
    f : [u24; 3],
}

Making this dependent on RFC: Generic integers #2581

Where I think things get a bit gross with this is generics and enums. It might be the case that this makes it hard to verify something is actually %8 bits in size, which should probably be enforced (though that's a tradeoff with composability of structs). Some way to specify how many bits an enum should take would be logical, along with a reasonable solution for dealing with being OOB of that enum. There's also the fun case of how to handle bools - making a 1-bit field that's not a bool feels gross even in C. 🤷‍♂️

@Lokathor
Contributor

The current bitfield situation is a potential blocker to using Rust in embedded applications at some organizations ... as it makes writing safe, standard, error-free code difficult.

As the owner of the gba crate, which is a crate for an embedded device, I'm very suspicious of this claim. I've never had a problem with using integer newtypes and bit-field manipulation methods, either in the creation of the type itself (which can be done quite readably with a macro_rules macro) or in using the type (which ends up reading like normal "builder pattern" code, extremely common in Rust). Perhaps I've got an advantage because the MMIO values don't need to be packed into larger structs with target padding, but even so this seems like a fairly simple thing to handle "properly" once and then never think about again. I even made the bitfrob crate so that all the different bit math things I'd need to do have clear names. I generally do agree that needing a dependency is generally worse than having something built into the language or available in core, but I wouldn't call using a dependency a blocker to adopting Rust.

@VegaDeftwing

because the MMIO values

For I/O I think it's not unreasonable to do it via bit shifts and what not. When you've got 100+ different packet types for shooting over a network each of different sizes (which may be many, many bytes large) needing to think to construct and destruct them can get quite tedious, and I don't know of a better way to handle it than C bitfields. Again, I'm far from a Rust pro, but when a not insignificant amount of the application logic is processing and handling these packets, it needs to be ergonomic to do work with their data.

@Lokathor
Contributor

If you make accessor methods for each bitpacked "field" you want to simulate, then there's a pretty clear conversion that's easy to remember:

| access | field syntax | accessor method |
|--------|--------------|-----------------|
| read | data.field | data.field() |
| write | data.field = new | data.set_field(new) |

Since bitpacked values aren't really held inside other bitpacked values, this simple rule is enough to handle almost any situation. Even if the overall struct for a situation contains two different bitpacked values, just treat each bitpacked value individually and the problem generally remains manageable.
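A minimal sketch of what such accessor methods could look like when generated by a `macro_rules` macro. The macro `bit_accessors`, the `Control` newtype, and its two fields are hypothetical; real crates differ in naming and in how they treat out-of-range values (here they are silently truncated).

```rust
/// Hypothetical helper macro: generates a getter and setter pair for a field
/// occupying `$width` bits starting at bit `$shift` of a `u16` newtype.
macro_rules! bit_accessors {
    ($ty:ident, $get:ident, $set:ident, $shift:expr, $width:expr) => {
        impl $ty {
            pub const fn $get(self) -> u16 {
                (self.0 >> $shift) & ((1u16 << $width) - 1)
            }

            pub fn $set(&mut self, new: u16) {
                let mask = ((1u16 << $width) - 1) << $shift;
                // Clear the old field bits, then merge in the truncated value.
                self.0 = (self.0 & !mask) | ((new << $shift) & mask);
            }
        }
    };
}

/// A bitpacked control value stored in a plain integer newtype.
#[derive(Clone, Copy, Default, Debug, PartialEq, Eq)]
pub struct Control(pub u16);

bit_accessors!(Control, mode, set_mode, 0, 3); // bits 0..=2
bit_accessors!(Control, priority, set_priority, 3, 2); // bits 3..=4
```

Usage then follows the rule above: `control.mode()` instead of `control.mode`, and `control.set_mode(5)` instead of `control.mode = 5`.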

@workingjubilee
Member

workingjubilee commented Jan 30, 2024

Nearly all C compilers (GCC, Clang, IAR, etc.) support marking bitfields as "Packed" via a #pragma. This tells the compiler to not use the otherwise arcane C rules for formatting bit fields and to not add platform dependent padding.

This replaces the "arcane C rules for formatting bit fields" with entirely compiler-specific, non-standard formatting, with no compatibility guarantees. Some platform ABIs go so far as to note that this is hypothetically possible, but it should never be exposed in a public header, ever, and that any such code that does so is nonconforming... right after noting an implementation-defined difference in generated layouts between two C compilers when you do this.

So I disagree with your conclusion:

So while I'm in support of taking time to get this right, saying "We can leave it up to crates" doesn't seem good enough.

...because at least if you use a crate, you have an actual guarantee that you have the same thing on both ends of the wire, as both compilers have to compile Rust correctly. This is the same, basically, as using a C library that does bit-munging exclusively with uint8_t or (uint8_t*, size_t) pairs: the C compiler may not compile such optimally, but at least it will not "miscompile" such because the compiler implemented a compiler-level pragma differently. Your proposed C-level solution requires validating the bit-level layout actually chosen by each compiler... at which point, the amount of validation you are doing means you prooobably could have worked with char* and come out with the same code.
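As an illustration of the byte-oriented approach, here is a sketch that spells out a wire layout explicitly instead of relying on any compiler's bit-field rules. The `Header` type and its 4/1/11-bit big-endian format are invented for this example and do not come from the thread.

```rust
/// Hypothetical 2-byte header: 4-bit version, 1-bit flag, 11-bit length,
/// most significant bit first.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Header {
    pub version: u8, // valid range 0..=15
    pub urgent: bool,
    pub length: u16, // valid range 0..=2047
}

impl Header {
    /// Returns `None` if a field does not fit its width.
    pub fn encode(self) -> Option<[u8; 2]> {
        if self.version > 0x0F || self.length > 0x07FF {
            return None;
        }
        let word = (u16::from(self.version) << 12)
            | (u16::from(self.urgent) << 11)
            | self.length;
        // Explicit byte order: identical output on every target.
        Some(word.to_be_bytes())
    }

    pub fn decode(bytes: [u8; 2]) -> Header {
        let word = u16::from_be_bytes(bytes);
        Header {
            version: (word >> 12) as u8,
            urgent: (word >> 11) & 1 == 1,
            length: word & 0x07FF,
        }
    }
}
```

Because every shift and the byte order are written out, both ends of the wire agree regardless of which compiler or target built them.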

Anyways, if deku using alloc is bad, consider using its dependencies like bitvec more directly (specifically, BitSlice). It is very common in no_std crates to use a simple allocator that allows use of alloc, however.

@workingjubilee
Member

workingjubilee commented Jan 30, 2024

...Now, aside from the note that I really hope you aren't trading data between any copies of armcc and gcc over the wire...

@VegaDeftwing In general, because Rust crates have access to procedural macros, which allow for writing significant syntax extension for the language, when we say "we should let crates handle this", it does not necessarily mean modifying the language is inconceivable. It means that it's currently believed that a library can provide a better API, even a better syntax, without having to PR their changes to the compiler, which allows them to iterate independently.

This is not true for all libraries, as not all code can be generated by simply having rustc dlopen() a library and run the TokenStream through the dlopened library. The work on core::simd, for instance, is unfeasible by such a means, as whether the abstract SIMD code optimizes well is part of the question. And e.g. generic integer support would greatly assist writing such a library to begin with. However, given the problems with reference-to-packed-field soundness, the Cell-like API that Lokathor describes is already de-facto mandatory, and no one's particularly arguing against generic integers, just against tying them to the implementation of bitfields.

If your corporation needs a better bit-munging library than currently exists, an obvious route suggests itself: contracting a Rust pro for such and worrying about whether the library is suitable for PRing to rustc later.

@valarauca

valarauca commented Jul 11, 2024

The lack of bitfield compatibility of Rust struct/union with C struct/union came up in a recent Usenix paper discussing the key challenges of integrating Rust into the Linux kernel. The paper discussing this won best paper at Usenix ATC 2024.

The paper claims this pattern is common in the linux kernel and the lack of a real ABI/FFI compatible solution leads to some non-trivial & measurable overhead when integrating Rust into the Linux Kernel.

There are other challenges, this is just 1 of the key 3 the authors highlighted.
