RFC: Structural Records #2584

Centril · 2018-11-02T20:47:28Z

🖼️ Rendered

📝 Summary

Introduce structural records of the form { foo: 1u8, bar: true } of type { foo: u8, bar: bool } into the language. Another way to understand these sorts of objects is to think of them as "tuples with named fields", "unnamed structs", or "anonymous structs".

💖 Thanks

To @kennytm, @alexreg, @Nemo157, and @tinaun for reviewing the draft version of this RFC.
To @varkor and @pnkfelix for good and helpful discussions.

Closure comment

Co-Authored-By: Centril <[email protected]>

Diggsey · 2018-11-02T21:17:54Z

This looks like a super well thought out and comprehensive RFC.

It might be worth clarifying the behaviour of #[derive(...)] with respect to types like:

struct RectangleTidy {
    dimensions: {
        width: u64,
        height: u64,
    },
    color: {
        red: u8,
        green: u8,
        blue: u8,
    },
}

Presumably, there will be no "magic" here, and you will only be able to derive traits which are implemented for the anonymous structs themselves.

Another question is around trait implementations: at the moment, 3rd party crates can provide automatic implementations of their traits for "all" tuples by using macro expansion to implement them for 0..N element tuples. With anonymous structs this will not be possible. Especially with the likely arrival of const generics in the not too distant future, negating the need for macros entirely, anonymous structs will become second class citizens. Is this a problem, and are there any possible solutions to allow implementing traits for all anonymous structs?

petrochenkov · 2018-11-02T21:58:57Z

text/0000-structural-records.md

+    other_stuff(color.1);
+    ...
+    yet_more_stuff(color.2);
+}


fn do_stuff_with((red, green, blue): (u8, u8, u8)) { some_stuff(red); other_stuff(green); yet_more_stuff(blue); }

Even better, with #2522:

fn do_stuff_with((red: u8, green: u8, blue: u8)) { some_stuff(red); other_stuff(green); yet_more_stuff(blue); }

However, while if you write it in this way it is clear what each tuple component means, it is not clear at the call site...

let color = (255, 0, 0); // I can guess that this is (red, green, blue) because that's // the usual for "color" but it isn't clear in the general case.

petrochenkov · 2018-11-02T22:06:06Z

text/0000-structural-records.md

+        blue: u8,
+    },
+}
+```


This syntax looks very similar to another already accepted RFC, but semantics seems different - #2102.

True; I have to think about if some unification can be done in some way here.

Centril · 2018-11-02T22:08:24Z

@Diggsey

This looks like a super well thought out and comprehensive RFC.

Thanks!

It might be worth clarifying the behaviour of #[derive(...)] with respect to types like:
struct RectangleTidy {
    dimensions: {
        width: u64,
        height: u64,
    },
    color: {
        red: u8,
        green: u8,
        blue: u8,
    },
}
Presumably, there will be no "magic" here, and you will only be able to derive traits which are implemented for the anonymous structs themselves.

This bit is not currently well specified, but it should be so I will fix that (EDIT: fixed)... I see two different ways to do it:

Use the structural records in there as types; this means that #[derive(Default)] would produce:
```
impl Default for RectangleTidy {
    fn default() -> Self {
        RectangleTidy {
            dimensions: Default::default(),
            color: Default::default()
        }
    }
}
```
The main benefit of this approach is that it is quite simple to implement. It will also work for all derivable standard library traits because implementations are auto-provided for structural records.
Use the structural records in there as syntax; this means that #[derive(Default)] would produce:
```
impl Default for RectangleTidy {
    fn default() -> Self {
        RectangleTidy {
            dimensions: {
               width: Default::default(),
               height: Default::default(),
            },
            color: {
                red: Default::default(),
                green: Default::default(),
                blue: Default::default(),
            }
        }
    }
}
```
This would need to be done recursively in the macro but it shouldn't be very hard to implement.

The main benefit of this is flexibility. More types would be derivable.

However, due to the automatically provided implementations as noted in 1. there is no benefit to this approach for the standard library so I think the sane behaviour is 1.

That said, a crate like serde can benefit from using the "magical" approach in 2. and we certainly cannot mandate what a procedural macro must do in the language so it will be up to each derive macro to decide what to do.

Another question is around trait implementations: at the moment, 3rd party crates can provide automatic implementations of their traits for "all" tuples by using macro expansion to implement them for 0..N element tuples.

Yup. Usually up to 12, emulating the way the standard library does this.

Especially with the likely arrival of const generics in the not too distant future,
negating the need for macros entirely, anonymous structs will become second
class citizens.

Nit: to not have to use macros for implementing traits for tuples you need variadic generics, not const generics. :)

Is this a problem, and are there any possible solutions to allow implementing traits for all anonymous structs?

I think it might be possible technically; I've written down some thoughts about it in the RFC. However, the changes needed to make it possible might not be what folks want.

However, not solving the issue might also be good; by not solving the issue you add a certain pressure to gradually move towards nominal typing once there is enough operations and structure that you want on the type.

…it)].

petrochenkov · 2018-11-02T22:28:41Z

text/0000-structural-records.md

+standard traits that are implemented for [tuples]. These traits are: `Clone`,
+`Copy`, `PartialEq`, `Eq`, `PartialOrd`, `Ord`, `Debug`, `Default`, and `Hash`.
+Each of these traits will only be implemented if all the field types of a struct
+implements the trait.


That's... quite a bit of magic.

I agree; this is noted in the drawbacks.

Ideas for possible magic reduction discussed below in #2584 (comment).

petrochenkov · 2018-11-02T22:30:52Z

text/0000-structural-records.md

+ For `PartialEq`, each field is compared with same field in `other: Self`.
+
+ For `ParialOrd` and `Ord`, lexicographic ordering is used based on
+  the name of the fields and not the order given because structural records


With macros fields can have names with same textual representation, but different hygienic contexts.

Does that change anything wrt. the implementations provided tho? I don't see any problems with hygiene intuitively, but maybe I can given elaboration?

This is probably relevant to other places where the order of fields needs to be "normalized" as well.

Sure. :) But I'm not sure if you are pointing out a problem or just noting...
Any sorting / "normalization" is done post expansion.

It's not clear to me how to sort hygienic contexts in stable order, especially in cross-crate scenarios. That probably can be figured out somehow though.

petrochenkov · 2018-11-02T22:39:30Z

text/0000-structural-records.md

+```rust
+ty ::= ... | ty_srec ;
+
+ty_srec ::= "{" (ty_field ",")+ (ty_field ","?)? "}" ;


I don't think types can start with { syntactically.
There are multiple already existing ambiguities where parser assumes that types cannot start with {, and upcoming const generics introduce one more big ambiguity ({} is used to disambiguate in favor of const generic arguments).
#2102 uses struct { field: Type, ... } to solve this (and also to discern between struct { ... } and union { ... }).

I'm not sure about expressions/patterns.
Unlimited lookahead may also be required, need to check more carefully.

There are multiple already existing ambiguities where parser assumes that types cannot start with {

Such as? (examples please...)

Const generics allow literals and variables as expressions but anything else needs to be in { ... }.
This is not ambiguous with structural records (because the one-field-record requires a comma...) but requires lookahead.

Unlimited lookahead may also be required, need to check more carefully.

Scroll down ;) It's discussed in the sub-section "Backtracking".

It's discussed in the sub-section "Backtracking".

It's a huge drawback.

Such as? (examples please...)

Nothing that can't be solved by infinite lookahead, but here's "how to determine where where clause ends":

fn f() where PREDICATE1, PREDICATE2, { a: ...

Does { a: ... belong to the function body or to the third predicate?

@laaas Because it is inconsistent with Tuples/Tuple-structs and also less ergonomic.

I don't think it's confusing for readers, the one-field case is clearly disambiguated by an extra comma and that is consistent with how tuples are treated. In other words, if { x, } is confusing, then so is (x,).

If it's going to be drawbacks of this pattern for the structural records, either for slowing parser or cause types can't start with { syntactically, couldn't we use .{ instead?

Inspiration by Zig programming language

Actually, I've been thinking about this recently and I've come to the conclusion that if we lean into the ambiguity, then this is an non-issue.

The need for backtracking here stems from ambiguity with type ascription. This can be fixed by simply assuming that { ident: is always going to be a structural record, either in expression or type position (also solves the const generics issue). That is, if given { foo: Vec<u8> } in an expression context, we will attempt parsing this as a structural record, and then fail at < with a parse error (backtracking can be done inside diagnostics only if we want to). This way, we have retained LL(k) (standing at {, the : is 2 tokens away and so therefore k = 2 in this instance).

If a user wants to disambiguate in favor of ascription, they can write { (ident: Type) }. Disambiguating in this way should not be a problem as { ident: Type } is not a very useful thing to write. Specifically, in the case of PREDICATE2, { a: ..., this should not be a problem because the return type itself is a coercion site and informs type inference.

Why would it fail at <? If you're trying to parse Vec<u8> as an expression, wouldn't the < be parsed as less-than, and you only fail when you see > (and in general arbitrarily far away)?

It seems like the same issue as when we tried to get rid of turbofish...

@comex Oh; yeah, you're right. It should fail at >. Still, we can take a snapshot before trying to parse the expression after ident: and backtrack if it turned out to be a type and provide good MachineApplicable diagnostics here.

It seems like the same issue as when we tried to get rid of turbofish...

Well in the sense that turbofish would also be unnecessary if we are OK with backtracking, but it seems like some people object to backtracking, so I've come up with a solution using LL(k) at the cost of not being able to recognize { binding: Type } as a block with a tail-expression ascribing binding. Because such code is, in my view, pathological, that should be fine. It's much less of an issue than turbofish from a learnability POV imo.

petrochenkov · 2018-11-02T22:48:46Z

text/0000-structural-records.md

+field_init ::= ident | field ":" expr ;
+```
+
+Note that this grammar permits `{ 0: x, 1: y }`.


#1595 was closed, so I'd expect this to be rejected.
This saves us from stuff like "checking that there are no gaps in the positional fields" as well.

Yeah probably; I've left an unresolved question about it.

struct A(u8); let _ = A { 0: 0u32 };

is accepted, so this is currently inconsistent as it stands anyway.

@varkor yeah that was accepted to make macro writing easier (and it totally makes it easier...!)

@petrochenkov I am not sure I understand the reasoning why this is not allowed:

struct Foo { 0: i32, // This is the only thing that fails ... arg_1: i32, named_1: i32, } fn main() { let foo = Foo { 0: 0, arg_1: 1, named_1: 42} ; println!("{}", foo.0); }

fails only at the struct definition with "^ expected identifier". Would it not just be necessary to change 'IDENTIFIER' to 'IDENTIFIER_OR_NUMBER' (pseudo-code wise)?

to make it all work as everything else seems to be able to work with '.0', '0: ', etc.?

I am also not seeing the drawback given that it is already inconsistent ...

My personal motivation with that is to mirror named function arguments in structs as they then perfectly match the:

0: arg_0, 1: arg_1, named_1: arg_3

pattern. Also getting that (0: as allowed identifier in structural records) in independent of structural records is one less thing that structural records need to handle.

To be precise: I am only suggesting to allow numbers as identifiers, nothing else - so there is still plenty of distinction from Tuple to Struct, but they can be 1:1 mapped, but as for every other identifier you need to be explicit about the 0, 1, ...:

Tuple (i32, i32) ~= Struct { 0: i32, 1: i32 }

text/0000-structural-records.md

petrochenkov · 2018-11-02T23:07:30Z

I think the language should be frozen for large additions like this in general, for a couple of years at least.
This is not a crucial feature and not a completion of another crucial feature, motivation seems really insufficient for a change of this scale.

scottmcm · 2018-11-03T01:56:47Z

The trait auto-implementation magic worries me, mostly from the point of view of things not the compiler wanting to implement traits on things. Today crates can do a macro-based implementation for the important tuples, and I can easily see a path to variadic generics where the impls could easily support any tuple. But I can imagine serde really wanting to be able to support these, so I wouldn't want to accept them without some sort of plan -- especially as we're getting closer to finally having good stories for our two other structural type families.

I'd also be tempted to block this on changing struct literal syntax to using = -- I really don't like the lookahead requirement here.

Overall, I think I feel that this is cool, but not necessary.

…lic.

clarfonthey · 2018-11-03T02:37:26Z

I'm very concerned with properties of the struct explicitly relying on lexicographic ordering. I don't think that Ord or PartialOrd should be implemented for structural records-- rather, they should be treated as having unordered fields. If someone wishes to enforce Ord or PartialOrd, I feel that they should be required to either name the struct or use a tuple.

I think that defining lexicographic behaviour for Hash and Debug, however, is completely fine. Hash really only needs to uphold that x == y => hash(x) == hash(y), and any ordering (as long as it's consistent) will satisfy this. Additionally, lexicographic ordering makes the most sense for debugging.

But as a user, I would be surprised to find the below code fail:

let first = { foo: 1, bar: 2 };
let second = { foo: 2, bar: 1};
assert!(first < second);

Centril · 2018-11-03T03:32:47Z

@clarcharr

I've added an unresolved question to be resolved prior to merging (if we do that...) about whether (Partial)Ord should be provided or not. I'm certainly open to your suggestion.

@scottmcm

I'd also be tempted to block this on changing struct literal syntax to using = -- I really don't like the lookahead requirement here.

I would have loved if we had used = instead... but blocking on that seems like blocking on a thing that no one will agree to changing because it would cause churn in more or less every single Rust program/library ever written... ;)

The trait auto-implementation magic worries me, mostly from the point of view of things not the compiler wanting to implement traits on things.

Sure; I entirely agree with that sentiment; it is by far the biggest drawback.

Today crates can do a macro-based implementation for the important tuples, and I can easily see a path to variadic generics where the impls could easily support any tuple. But I can imagine serde really wanting to be able to support these, so I wouldn't want to accept them without some sort of plan -- especially as we're getting closer to finally having good stories for our two other structural type families.

I noted somewhere in the RFC that with the combination of const generics and variadic generics, you may be able to extend this to structural records as well. For example (please note that this is 100% a sketch), using a temporary syntax due to Yato:

impl
    <*(T: serde::Serialize, const F: meta::Field)>
    // quantify a list of pairs with:
    // - a) type variables T all bound by `serde::Serialize`,
    // - b) a const generic variables F standing in for the field of type meta::Field
    //       where `meta` is the crate we reserved and `Field` is a compile time
    //       reflection / polymorphism mechanism for fields.
    serde::Serialize
for
   { *(F: T) }
   // The structural record type; (Yes, you need to extend the type grammar *somehow*..)
{
    // logic...
}

This idea is inspired by Glasgow Haskell's Symbol.

Stargateur · 2021-04-10T12:34:34Z

Ah shame, this would be a nice to have for readability in single-use types, per rfc

the amount of work, and problem this RFC come with is not worth the price.

jtmoon79 · 2022-03-25T03:41:40Z

For anyone that would still like to use Structural Records, as of this writing there are two crates to try:

crate structx. Thanks @oooutlk, (from rust-lang post 58337).
crate named_tuple. Thanks @Flier Lu .

Full Disclosure: I haven't tried either. Just suggestions for anyone that ends up here.

crumblingstatue · 2022-09-15T13:44:12Z

Use case: Named arguments for Fn*

mut drawfn: impl FnMut(&mut Vec<Vertex>, f32, f32, &[u8], usize, Color)

This would be clearer as

mut drawfn: impl FnMut({
        vertex_buffer: &mut Vec<Vertex>,
        x: f32,
        y: f32,
        data: &[u8],
        idx: usize,
        color: Color
})

Rust doesn't have direct support for named arguments in Fn*, but this feature makes it very convenient to emulate them, without having to create distinct types (that the user has to click on in documentation to see the arguments, have to import to use, etc.).

Stargateur · 2022-09-15T13:59:13Z

that the user has to click on in documentation to see the arguments

You would still need to look doc to see name.

have to import to use

not important

etc.

?

without having to create distinct types

I really advice to think about it

struct Draw<'a, 'b> {
  vertex_buffer: &'a mut Vec<Vertex>,
  x: f32,
  y: f32,
  data: &'b [u8],
  idx: usize,
  color: Color,
}

This may seem verbose but that force you to think twice about your api, if this struct look ugly for you this probably mean you could use some structure to better group logical variable (maybe a tuple for (x, y) ?). This also mean your function should reduce the number of argument it's take, these contrains will force you to improve your api.

crumblingstatue · 2022-09-16T00:21:37Z

You would still need to look doc to see name.

Are you taking about the argument names?
Yes, you have to look at the function documentation to see what arguments it takes. But you don't have to leave the page of the function to go to a struct documentation page, just to see what arguments it takes.

not important

That's your opinion. In my opinion ergonomics are important.

This may seem verbose but that force you to think twice about your api, if this struct look ugly for you this probably mean you could use some structure to better group logical variable. This also mean your function could reduce the number of argument it's take, that together will improve your api.

I think my arguments are just fine, thank you. I just want names for them, like you can have for regular functions.
Imagine the documentation nightmare if regular function parameters didn't have names.

Stargateur · 2022-09-16T02:34:37Z

I just want names for them, like you can have for regular functions.

Oh sorry I miss the fact your case is about FnMut Trait. Generally if you use Fn family trait I would advice to keep it very simple, they are mostly for closure that should have 1 or 2 argument max. Thus I maintain my point but I will add that you may use a trait like:

trait Draw {
    fn draw(
        &mut self,
        vertex_buffer: &mut Vec<Vertex>,
        x: f32,
        y: f32,
        data: &[u8],
        idx: usize,
        color: Color,
    );
}

// this allow auto closure impl and function
impl<T> Draw for T
where
    T: FnMut(&mut Vec<Vertex>, f32, f32, &[u8], usize, Color),
{
    fn draw(
        &mut self,
        vertex_buffer: &mut Vec<Vertex>,
        x: f32,
        y: f32,
        data: &[u8],
        idx: usize,
        color: Color,
    ) {
        self(vertex_buffer, x, y, data, idx, color)
    }
}

This would result in mut drawfn: impl Draw, I'm not sure if that would fill your need but you may like this approach.

crumblingstatue · 2022-09-16T06:50:46Z

Generally if you use Fn family trait I would advice to keep it very simple, they are mostly for closure that should have 1 or 2 argument max.

Why this arbitrary restriction? Should regular functions also have 1 or 2 arguments max? If not, what makes closures so different?

but I will add that you may use a trait like

I have no idea what I'm looking at honestly.
It sounds more convoluted than simply taking arguments in a closure. With which the only problem is the lack of naming of arguments, which this feature could help with.
I close my case, as I don't want to derail this thread any further. I presented a potential use case for structural records, everyone can decide for themselves whether it's a good use case or not.

Stargateur · 2022-09-16T13:24:37Z

Why this arbitrary restriction? Should regular functions also have 1 or 2 arguments max? If not, what makes closures so different?

I think yes, function should have max 2, and on few case 3 arguments. If you look at std I think there are few 3 arguments function and probably very few that have more than 3 (exclude self). If a function require a lot of argument it's become difficult to remember argument order and you can easily make mistake, cause, no named argument. (That also why I think function should not have argument of same type for +3 argument count). That also why I disadvise tuple in public api that have +3 elements.

While named argument is a feature very close to structural record we both lack of them in Rust. I argue that it's not a bad thing cause it's force dev to make greater api. With these features a dev could easily choice the easy way to have function with a lot of argument. Most python lib I used are like that, and honestly I don't like it at all, I already used function that have 10 arguments, it's not something I would like in Rust crate.

Rust is build around structural programing, define struct, define trait is rusty. I think Rust lang is doing dev a favor to force them to be verbose, to force them to use structure or trait that structure the code, force you to have strong typing. I think having naming function argument or naming structure record would make more evil than good.

If I would be a user and I see this:

mut drawfn: impl FnMut({
        vertex_buffer: &mut Vec<Vertex>,
        x: f32,
        y: f32,
        data: &[u8],
        idx: usize,
        color: Color
})

I would clearly not like it, there is too many argument, I would probably need to make a function cause a closure would be too big for that, also there would be no direct documentation. My trait solution would solve most of these problems and would allow you a very nice documentation (bonus rust-analyzer would make it very fast to impl it) but I think you could also refine your api, surely data and idx could be linked, and vertex_buffer x y look also linked.

/// write something useful
struct VertexBuf<'a> {
    buf: &'a mut Vec<Vertex>,
    x: f32,
    y: f32,
}

/// write something useful
struct Data<'a> {
    data: &'a [u8],
    i: usize,
}

trait Draw {
    /// write something useful
    fn draw(&mut self, vertex_buffer: VertexBuf, data: Data, color: Color);
}

fn some_function(mut drawfn: impl FnMut(VertexBuf, Data, Color)) {}
// or using trait
fn some_function(mut draw: impl Draw) {}

I close my case, as I don't want to derail this thread any further. I presented a potential use case for structural records, everyone can decide for themselves whether it's a good use case or not.

I think we are on topic, sorry if I started an off topic conversation.

MatrixDev · 2022-09-16T13:56:42Z

I would clearly not like it, there is too many argument

I agree with many arguments in a function is a very bad practice. But forcing user to declare a struct that might be used only in one function is also not an ideal solution.

I think Rust lang is doing dev a favor to force them to be verbose, to force them to use structure or trait that structure the code, force you to have strong typing.

Sometimes arguments are added to a function one-by-one as part of some small commit here and there. In this case, especially on a bigger project, it is very rare what code will be refactored (due to time constraints or other reasons) just to make API look correct. It is probably better to allow at least some way of naming argument in-place to increase overall code readability. Maybe I'm the unlucky one here but on most projects with 30+ people I was on this is the reality.

IMHO named arguments for a function will remove most requirements for the structural records.

crumblingstatue · 2022-09-16T14:57:12Z

I think yes, function should have max 2, and on few case 3 arguments. If you look at std I think there are few 3 arguments function and probably very few that have more than 3 (exclude self).

Counter example: rustc codebase
It has many functions that take a lot of arguments. Here is just 3 picked at random:

The standard library APIs have a fairly simple job of exposing OS APIs and some core data structures.
They don't have to manage complex application state, so they are a bad example.

The rustc developers have better things to do than try to play type tetris to cram everything into structs so everything meets the arbitrary requirement of at most 3 arguments per function.

Stargateur · 2022-09-16T17:44:17Z

@crumblingstatue I don't see how that prove anything or change anything to my point, never I said the quality of rustc implementation was great. Your example only reinforce my opinion that such function is a nightmare to maintain. Also, these functions are mostly call once and are totally different that the use case you bring up and about public API design I try to explain since 3 messages.

I have the feeling you don't understand my points. I will so stop arguing more about this further since this go nowhere with the last two messages.

dnut · 2023-03-20T16:49:23Z

@pnkfelix:
a feature we should have in the long term...
...not something we can justify investing project time on in the foreseeable future.

This is an unusual reason to close an RFC in an open source project. "This is a feature that we should have" warrants a formally positive response, even if it is too costly for a particular set of contributors to justify implementing at this time. My instinct would be to keep the RFC open, but allocate no resources to it. If a perfect implementation falls in our lap, we'd be happy to accept it. Just don't take time away from other work to implement it.

Rust is open source. If prior non-contributors really want to see this feature happen, they might decide to work on it themselves, despite never previously having interest to work on rust. After working on it, they might become regular contributors. But without leaving this window open, they might never join the project.

There is a feedback mechanism between the resources available to a project and the contributions the project is willing to accept. Clearly communicating an openness to improvements, even if most of the current contributors think the improvements are not worth their effort, could increase the amount of resources available to a project, compensating for any incurred costs.

digama0 · 2023-03-21T00:47:16Z

@pnkfelix:
a feature we should have in the long term...
...not something we can justify investing project time on in the foreseeable future.

This is an unusual reason to close an RFC in an open source project. "This is a feature that we should have" warrants a formally positive response, even if it is too costly for a particular set of contributors to justify implementing at this time. My instinct would be to keep the RFC open, but allocate no resources to it. If a perfect implementation falls in our lap, we'd be happy to accept it. Just don't take time away from other work to implement it.

At first I agreed with this point, usually such RFC's are closed as postponed rather than closed. But on reading the history of this issue, I think it is an appropriate choice.

The point is that this is not a change that can be accomplished by an RFC, this needs something closer to a lang team initiative or a working group to get done, because it has far reaching effects on all other parts of the language. And there is not enough bandwidth on the team to consider that in the foreseeable future, and by the time it comes around a new "v2" RFC will be needed anyway, even though it may draw heavily on aspects of this one. A contributor dropping an implementation on the team's lap wouldn't solve all the other issues around interaction with other features (present and future) which are part of the current vision for the language.

I think the best way to a motivated contributor to push this forward would be to drum up support for a working group about it.

scottmcm · 2023-03-21T01:39:49Z

To extend @digama0's point here, I don't think this is a case where, necessarily,

If a perfect implementation falls in our lap, we'd be happy to accept it.

Adding a new fundamental kind of type is a pervasive impact. It's not just a lang question, it's now got type-team impact, opsem-team impact, miri impact, libs-api impact, proc macro impact, different-backends impact, etc.

I pretty much feel the same now as I did two years ago (#2584 (comment)): the way forward is not to try to make the whole new type kind right away, but to look at the motivation and attempt to address some of those pieces in a lower-impact way. What parts of the goodness of this can be done in a way that they can mostly reuse existing inference, existing MIR, existing derives, and the existing type system?

What makes it hard to define a type? What makes it hard to use a type? Can we introduce shorthands for common cases? Could we allow them in more places? Could this mostly be done with Voldemort types, rather than structural ones? Etc.

(Just speaking for me, not stating a lang team position here.)

wongjiahau · 2023-03-21T02:48:55Z

TLDR;

The syntax of strutural and nominal record access should differ so that most changes occur only at the compiler frontend.

Longer version

As far as I know, structural records and tuples carry the same semantics, so why can't we implement structural record as a syntactic sugar of tuples? Doing so requires minimal changes to the backend, i.e. type checking, Miri, code generation etc.

For this to work, the syntax of structural records access has to be different from that of nominal records (the current struct system), so we don't have to incorporate extra mechanisms into the type-checker to disambiguate structural record access from nominal record access, which is probably the main reason why some think this proposal is pervasive.

For the sake of explanation, let's pretend the syntax of structural record access looks like this:

expression '#' identifier

For example:

myObject#hello

Other syntactical aspects of structural records, such as type annotation, construction, and destructuring should already be distinguishable from nominal records, thus they require no syntax changes.

As a proof of concept:

During parsing, desugars structural records construction into tuples construction, for example: { x: 1, y: 2 } desugars into (usize, usize). The value should be sorted by their corresponding key lexicographically.
During type checking, desugars structural record access, for example, expr#x desugars into (expr.0).
2.1. This translation works because the type of expr must be known before any structural record access can be applied.
Structural record destructuring works the same as 2.
For diagnostics purposes, the intermediate representation of a tuple needs to carry additional information, namely structural_record_keymap: Option<StruturalRecordKeymap>.

In conclusion, as long as we keep structural and nominal record access syntax distinct, no pervasive changes are needed.

digama0 · 2023-03-21T03:40:47Z

During parsing, desugars structural records construction into tuples construction, for example: { x: 1, y: 2 } desugars into (usize, usize). The value should be sorted by their corresponding key lexicographically.

During type checking, desugars structural record access, for example, expr#x desugars into (expr.0).
2.1. This translation works because the type of expr must be known before any structural record access can be applied.

If {x: usize, y: usize} is "mere sugar" for (usize, usize), which is to say they are the same type according to the type checker, then {x: usize, y: usize} and {y: usize, z: usize} are the same type. So how would you "desugar" expr#y if expr has the type (usize, usize)? For a more concrete example:

let a = if cond() {
  let b: {x: usize, y: usize} = (0, 1);
  b
} else {
  let c: {y: usize, z: usize} = (2, 3);
  c
};
a#y

Does the last line desugar to a.0 or a.1?

I don't think the expr#x syntax can be sensibly implemented unless structural record types are part of the type system, distinct from tuples (so that we don't lose track of the name -> index mapping). And then we have the indicated challenges wrt type inference, impl coherence, and layout / MIR changes.

wongjiahau · 2023-03-21T04:02:27Z

So how would you "desugar" expr#y if expr has the type (usize, usize)? For a more concrete example:

Sorry, I miscommunicated earlier. For Step 2.1, the desugars only happen after type-checking is completed without error. In other words, the type checker is aware of the existence of structural record type.

let a = if cond() {
  let b: {x: usize, y: usize} = (0, 1);
  b
} else {
  let c: {y: usize, z: usize} = (2, 3);
  c
};
a#y

Does the last line desugar to a.0 or a.1?

This question is invalid because the type of a is unknown, as {x: usize, y: usize} cannot be unified with {y: usize, z: usize}, at least not without structural subtyping, which is coherent with the semantics of tuple as of now, as tuple does not support subtying, for example (usize, usize) cannot be unified with (usize, usize, usize).

So again, most of the changes are on frontend, namely parsing and type checking (or type inference).
Everything beyond type checking is the same, except that the Tuple node has to carry an optional name-index mapping.

digama0 · 2023-03-21T04:51:46Z

If it's a separate type, then this isn't a desugaring at all, as it has to participate in the whole frontend and middle end. The MIR uses the same type system as the type checker (except for erasing regions after borrowck). At that point { x: usize, y: usize } is no more similar to (usize, usize) than struct Foo { x: usize, y: usize } is.

Another issue that arises: Is {x: usize, y: usize} distinct from (usize, usize) for impl coherence purposes?

I believe "no" is unsound, because you can use impl coherence to force the type system to think two generic types are equal, and thereby construct a variation on the example I gave earlier, by proving that {x: usize, y: usize} == {y: usize, z: usize} and then projecting e#y with surprising results.
"Yes" is also problematic for reasons detailed in the RFC: you can't implement traits on all structural types because there is no way you can do something similar to "macro implement for tuples of length up to N" for structural records.

scottmcm · 2023-03-21T07:27:10Z

why can't we implement structural record as a syntactic sugar of tuples?

C# went down this road. It works reasonably well, but it's still risky. For example, this compiles https://dotnetfiddle.net/fYHBJa

	public static (int B, int A) Foo() => (1, 2);
	
	public static void Main()
	{
		(int A, int B) pair = Foo();
		Console.WriteLine(pair);
	}

and doesn't do what you'd want from a proper structural type.

So you can sortof do it, but to make it work well you have to add a ton of random little "oh, wait, we'd better lint that" stuff. Not to mention all the "and what do you get from Any?" and "why don't I get the names in serde?" things.

Thus once you want it to be preserved through closures, say, all the "kinda like a type but isn't" are maybe even worse than having it be a real type in the first place. (This reminds me of all the repr(packed) problems, where there's a bunch of code that has to pass around alignment separately from the type, because fields in packed structs are this kind of "type-like thing but not actually part of the type" problem.)

programmerjake · 2023-03-21T08:51:50Z

edit:
created an internals thread for discussing my desugaring idea: https://internals.rust-lang.org/t/structural-types-again-desugar-to-named-struct-in-std/18550?u=programmerjake

please discuss on internals since here is waay too cluttered

original post:
a quick idea: instead of desugaring to tuples, how about desugaring to an instance of a named struct in std:

// in std::somewhere
pub struct AnonymousStruct<Fields: std::marker::Tuple, const NAMES: &'static [&'static str]>(pub Fields);

impl<Fields: std::marker::Tuple, const NAMES: &'static [&'static str]> AnonymousStruct<Fields, NAMES> {
    /// Self::get_field_index("abc") = Ok(5) means self.abc desugars to self.0.5
    pub const fn get_field_index(name: &str) -> Result<usize, ()> {
        const { assert!(NAMES.is_sorted(), "NAMES must be sorted"); };
        NAMES.binary_search(&name).map_err(|_| ())
    }
}

pseudo-code:

fn desugar_anonymous_struct_type(fields: Vec<(Member, Type)>) -> Result<Type> {
    // sorts by field name
    let mut fields: BTreeMap<Member, Type> = fields.into_iter().collect();
    let mut tuple_fields: Punctuated<Type, _> = (0..).map_while(|i| fields.remove(&Member::Unnamed(i.into()))).collect();
    if fields.is_empty() {
        // actually a tuple, since all fields are named using successive integers
        return Ok(TupleType { paren: Paren::default(), elems: tuple_fields }.into());
    }
    if !tuple_fields.is_empty() {
        bail!("all fields must be named");
    }
    let mut keys = vec![];
    let mut types = vec![];
    for (k, v) in fields {
        let Member::Named(k) = k else {
            bail!("all fields must be named");
        };
        keys.push(LitStr::new(&k.to_string(), k.span()));
        types.push(v);
    }
    Ok(parse_quote!(::std::somewhere::AnonymousStruct<(#(#types,)*), { &[#(#keys),*] }>))
}

I would expect the code for the field access operator . to be structured such that a special case can be easily added for a lhs of AnonymousStruct since it needs to resolve the lhs type anyway in order to resolve existing types' field names.

programmerjake · 2023-03-21T09:03:42Z

desugaring to a named struct in std instantiated with a tuple of the sorted field types and a const generic slice of sorted field names for the anonymous struct neatly solves nearly all the implementation and how-to-impl-traits-for-all-anonymous-structs issues afaict.

programmerjake · 2023-03-21T09:15:41Z

created an internals thread for discussing my desugaring idea: https://internals.rust-lang.org/t/structural-types-again-desugar-to-named-struct-in-std/18550?u=programmerjake

please discuss on internals since here is waay too cluttered

the-a-man-006 · 2024-03-14T12:56:32Z

This sounds so right, guys, please implement it...

Centril and others added 9 commits October 28, 2018 19:13

rfc, structural-records: initial version.

482bc50

rfc/structural-records: fix typo.

a8b1a23

Co-Authored-By: Centril <[email protected]>

rfc, structural-records: fix type theory encoding mishap.

383a8dc

rfc, structural-records: clarify that component types must be *WF*.

add3e0b

rfc, structural-records: elaborate on drawbacks re. orphans.

627f75d

rfc, structural-records: elaborate some more.

d2b61cb

rfc, structural-records: discuss { 0: x, 1: y, .. }

c9b38d7

rfc, structural-records: adjust start date.

9611276

rfc, structural-records: ?Sized -> !Sized.

90ab4e8

Centril added the T-lang Relevant to the language team, which will review and decide on the RFC. label Nov 2, 2018

This was referenced Nov 2, 2018

Language feature: Named variables in function return declarations #1999

Closed

Wishlist: functions with keyword args, optional args, and/or variable-arity argument (varargs) lists #323

Open

petrochenkov reviewed Nov 2, 2018

View reviewed changes

rfc, structural-records: elaborate on behavior re. #[derive(LibstdTra…

0516865

…it)].

petrochenkov reviewed Nov 2, 2018

View reviewed changes

text/0000-structural-records.md Outdated Show resolved Hide resolved

petrochenkov reviewed Nov 2, 2018

View reviewed changes

text/0000-structural-records.md Show resolved Hide resolved

Centril added 2 commits November 3, 2018 03:12

rfc, structural-records: clarify that immediate fields are always pub…

8b948d1

…lic.

rfc, structural-records: note another instance of backtracking.

001c87c

rfc, structural-records: add unresolved question re. (Partial)Ord.

18a2684

rfcbot closed this Apr 10, 2021

taiki-e mentioned this pull request May 25, 2021

add Tuplable tokio-rs/valuable#49

Merged

Centril mentioned this pull request Jan 20, 2022

MESH-1773 & MESH-1777 condition cost function PolymeshAssociation/Polymesh#1204

Merged

Centril mentioned this pull request Sep 7, 2022

Structual records? FuelLabs/sway#2750

Closed

kanashimia mentioned this pull request Jun 21, 2023

Inferred types _::Enum #3444

Open

dhardy mentioned this pull request Nov 8, 2023

Nightly feature usage kas-gui/kas#25

Open

Veetaha mentioned this pull request Sep 8, 2024

feat: allow to access already set fields within pattern of partial shared builder elastio/bon#111

Closed

RFC: Structural Records #2584

RFC: Structural Records #2584

Conversation

Centril commented Nov 2, 2018 • edited by Nemo157 Loading

📝 Summary

💖 Thanks

Diggsey commented Nov 2, 2018 • edited Loading

Choose a reason for hiding this comment

Centril Nov 2, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Centril commented Nov 2, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

petrochenkov Nov 2, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

petrochenkov Nov 2, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Centril Jan 10, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Centril Nov 4, 2018 • edited Loading

Choose a reason for hiding this comment

LionsAd Aug 3, 2020 • edited Loading

Choose a reason for hiding this comment

petrochenkov commented Nov 2, 2018 • edited Loading

scottmcm commented Nov 3, 2018 • edited Loading

clarfonthey commented Nov 3, 2018

Centril commented Nov 3, 2018 • edited Loading

Stargateur commented Apr 10, 2021

jtmoon79 commented Mar 25, 2022 • edited Loading

crumblingstatue commented Sep 15, 2022 • edited Loading

Stargateur commented Sep 15, 2022 • edited Loading

crumblingstatue commented Sep 16, 2022 • edited Loading

Stargateur commented Sep 16, 2022 • edited Loading

crumblingstatue commented Sep 16, 2022

Stargateur commented Sep 16, 2022 • edited Loading

MatrixDev commented Sep 16, 2022

crumblingstatue commented Sep 16, 2022 • edited Loading

Stargateur commented Sep 16, 2022

dnut commented Mar 20, 2023

digama0 commented Mar 21, 2023 • edited Loading

scottmcm commented Mar 21, 2023

wongjiahau commented Mar 21, 2023 • edited Loading

TLDR;

Longer version

digama0 commented Mar 21, 2023 • edited Loading

wongjiahau commented Mar 21, 2023 • edited Loading

digama0 commented Mar 21, 2023 • edited Loading

scottmcm commented Mar 21, 2023

programmerjake commented Mar 21, 2023 • edited Loading

programmerjake commented Mar 21, 2023 • edited Loading

programmerjake commented Mar 21, 2023

the-a-man-006 commented Mar 14, 2024

Centril commented Nov 2, 2018 •

edited by Nemo157

Loading

Diggsey commented Nov 2, 2018 •

edited

Loading

Centril Nov 2, 2018 •

edited

Loading

Centril commented Nov 2, 2018 •

edited

Loading

petrochenkov Nov 2, 2018 •

edited

Loading

petrochenkov Nov 2, 2018 •

edited

Loading

Centril Jan 10, 2020 •

edited

Loading

Centril Nov 4, 2018 •

edited

Loading

LionsAd Aug 3, 2020 •

edited

Loading

petrochenkov commented Nov 2, 2018 •

edited

Loading

scottmcm commented Nov 3, 2018 •

edited

Loading

Centril commented Nov 3, 2018 •

edited

Loading

jtmoon79 commented Mar 25, 2022 •

edited

Loading

crumblingstatue commented Sep 15, 2022 •

edited

Loading

Stargateur commented Sep 15, 2022 •

edited

Loading

crumblingstatue commented Sep 16, 2022 •

edited

Loading

Stargateur commented Sep 16, 2022 •

edited

Loading

Stargateur commented Sep 16, 2022 •

edited

Loading

crumblingstatue commented Sep 16, 2022 •

edited

Loading

digama0 commented Mar 21, 2023 •

edited

Loading

wongjiahau commented Mar 21, 2023 •

edited

Loading

digama0 commented Mar 21, 2023 •

edited

Loading

wongjiahau commented Mar 21, 2023 •

edited

Loading

digama0 commented Mar 21, 2023 •

edited

Loading

programmerjake commented Mar 21, 2023 •

edited

Loading

programmerjake commented Mar 21, 2023 •

edited

Loading