Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"Conflicting representations" warning became an error #68428

Closed
Manishearth opened this issue Jan 21, 2020 · 24 comments · Fixed by #68586
Closed

"Conflicting representations" warning became an error #68428

Manishearth opened this issue Jan 21, 2020 · 24 comments · Fixed by #68586
Assignees
Labels
P-high High priority regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@Manishearth
Copy link
Member

#[repr(u32)]
#[repr(C)]
enum X {
    A, B, C
}

(playground)

This used to be a warning but now is a hard error. This is a breaking change.

This was introduced in 2c3e5d3#diff-9e711d5c109e022cbb94049be8c4ca17 , cc @Centril

@Manishearth Manishearth added the regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. label Jan 21, 2020
@jonas-schievink jonas-schievink added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jan 21, 2020
@Centril
Copy link
Contributor

Centril commented Jan 21, 2020

I was originally going to turn this into a lint (#67770 (comment)), but @petrochenkov said we always intended to make this an error. Given that it has been a hard-coded warning for so long (3.5 years), it's probable that crater wouldn't show up many instances, but beta crater runs might show otherwise.

@Manishearth
Copy link
Member Author

We just hit this in the wild, this isn't an uncommon mistake to make since typical FFI advice is to slap repr(C) on everything (otherwise the lint yells at you). Also, crates don't necessarily get updated that often, the one that broke for us is two years old. I'll work on getting it updated, but I suspect this will trigger in other stuff.

We should really really not be making breaking changes like these just off the cuff, especially without a future incompatibility lint, and especially rolled into the depths of some basically unrelated refactoring PR. We have a process for this, and there's a reason behind that process -- we have pretty strong stability commitments.

@Centril
Copy link
Contributor

Centril commented Jan 22, 2020

We should really really not be making breaking changes like these just off the cuff, especially without a future incompatibility lint, and especially rolled into the depths of some basically unrelated refactoring PR.

While it's a "breaking change" in the literal sense of "a change that breaks code" it is also a bug-fix for which a hard-coded warning (which are firmer than lints) has existed for a long time. Our policy around bug-fixes makes no hard requirements that we must have C-future-compatibility lints, that crater must be run, and that fixes cannot be landed immediately. These are made, historically, and contemporarily, based on an assessment of the individual case. That is, we have landed such hard errors immediately before and waited for the beta crater run to point out problems.

@Manishearth
Copy link
Member Author

I don't really want to language lawyer the RFC, but since you bring it up, it says:

The first step then is to evaluate the impact of the fix on the crates found in the crates.io website (using e.g. the crater tool). If impact is found to be "small" (which this RFC does not attempt to precisely define), then the fix can simply be landed
.....
As today, the commit message of any breaking change should include the term [breaking-change] along with a description of how to resolve the problem, which helps those people who are affected to migrate their code. A description of the problem should also appear in the relevant subteam report.

The policy definitely says that we should run crater before knowingly landing breakages. Of course, things can be missed and that's what the beta run is for, but in this case it's pretty clear y'all knew it was a breakage.

The PR body did not even mention this breaking change or that changes of this type were being made, let alone call it out as a breaking change so that others may notice it.

We should be immediately reverting this change, and making a second PR with just this change, the appropriate tags being added, the appropriate discussion happening, and a full crater run being done.

Especially since we already have at least one in-use crate affected by this.


It's also very clear from this text that the RFC focuses primarily on soundness change, with bugs included but being something that needs more per-case judgement. This bug seems rather benign: it doesn't really do anything unexpected (and definitely not something UBish) if you write #[repr(C, u32)] (since you can't do repr(u32) in C directly to match it), it just doesn't really make sense.

cc @nikomatsakis about the intent of the RFC around more benign compiler bugs.


I'm hoping we land rust-lang/rfcs#2834 so that the UX around this gets better as well. Three years is a long time for a crate you maintain, but for "finished" crates (like the googlevr bindings crate that we're using) that are only used as dependencies, it's not really a long time at all.

@nikomatsakis
Copy link
Contributor

My memory is the "intent" of the RFC applied equally to any sort of breaking change. The assumption is that all breaking changes are justified to begin with, so it's only a question of how we go about it. The high-level idea of the RFC is "gently". :)

My position on these sorts of disputes is that if people are complaining, we should typically start by backing out or reversing the "breaking part" of the change, so that we can have the conversation in a "calmer state". As I've said before, my belief is that we can generally get by with less process so long as we're willing to be reasonable when breakage does occur.

That said, I don't feel like I know the particulars of this case. A few questions:

  • The problem is multiple repr attributes applied to the same type? (Are there any special cases, e.g., if the same repr is applied multiple times?)
  • We currently issue a warning? What exactly does the warning say? Is it a future-compatibility warning, with an associated tracking issue or whatever?
  • What was the semantics of multiple repr attributes before the change?
  • (Presumably the effect afterwards is to produce an error.)

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Jan 22, 2020

I guess I will say this. I don't remember the history of this change, and I wouldn't necessarily want to single it out, but it seems to be an example of a change that isn't really necessary -- making this improvement doesn't fix any soundness hole or have any large effect. Changes of that sort make me nervous, though we do them from time to time for sure.

@petrochenkov
Copy link
Contributor

petrochenkov commented Jan 22, 2020

This can always be changed to a deny-by-default compatibility lint if there's breakage in practice.

I do think this is something that need to be fixed because the compiler just cannot satisfy requests like #[repr(u8, u16)] and the actual codegen in that case is unpredictable.

@Manishearth
Copy link
Member Author

There are some cases in this lint where it's unpredictable, but other cases like repr(C, u32) are less problematic (and thus more common as well). #[repr(C, u32)] is something I can easily imagine someone doing because our advice (and our lints!) advise you to tag everything going through the C ABI as repr(C). Sometimes you're actually passing something as a u32, so you need that repr instead. #[repr(u8, u16)] is nonsensical.

There are definitely cases where this error is a lot more important to make happen, but given the care we take in making breaking changes, it seems broader than necessary in that case, and if there is a breaking change here that needs to happen as soon as possible we should make just that change and not the broader change. I think making the broader change is fine as well, but we could make it with a more relaxed timeline, perhaps waiting for rust-lang/rfcs#2834 or maybe an edition. It's not that pressing.

@Manishearth
Copy link
Member Author

The problem is multiple repr attributes applied to the same type? (Are there any special cases, e.g., if the same repr is applied multiple times?)

Yes. It's for specific mixes only:

  • multiple integer types
  • simd + c
  • C + integer, on a C-like enum

In fact, the following is allowed, because of RFC 2195

#[repr(C, u32)]
enum X {
    A, B, C(bool)
}

We currently issue a warning? What exactly does the warning say? Is it a future-compatibility warning, with an associated tracking issue or whatever?

Yes, but it's a normal hardcoded (non-lint) warning.

What was the semantics of multiple repr attributes before the change?

I believe it's unpredictable for some repr attribute combinations, for others it just ignores one or more.

@petrochenkov
Copy link
Contributor

petrochenkov commented Jan 22, 2020

If something is reasonable to write and the compiler can codegen it, e.g. #[repr(C, u32)] enum Foo (with enum class Foo: uint32_t on the other side of FFI), then it's not "conflicting" and it needs to be allowed and not reported as a warning (previously), or error (now), or a lint.

@Manishearth
Copy link
Member Author

Oh, yeah, C++ lets you do that too, so it is in fact reasonable to write that, though that's not quite "repr C" (C doesn't let you), and repr(u32) already gets you the same behavior.

However RFC 2195 seems to define repr(C, u32) as the same as repr(u32) on dataful enums, so it would be nice to have consistency here.

@nikomatsakis
Copy link
Contributor

Naively, I would expect that #[repr(C)] and #[repr(u32)] would be unioned together, given that #[repr(C, u32)] is indeed a thing.

(However, giving it that semantics when it didn't have it before would also be potentially a breaking change.)

@nikomatsakis
Copy link
Contributor

This can always be changed to a deny-by-default compatibility lint if there's breakage in practice.

It seems like @Manishearth is saying that there was breakage in practice, and it's causing them trouble. If it's causing them trouble, it's probably causing others trouble. It seems reasonable then to convert to deny-by-default.

@Manishearth
Copy link
Member Author

If it's causing them trouble, it's probably causing others trouble.

Yeah, that's my main contention, I've already fixed the problem in that crate so the most pressing thing is not an issue anymore, but in general I'm sure I'm not the only one who hit this.

Furthermore, it seems pretty clear that this repr situation is actually somewhat nuanced, and this underscores why we need to have separate explicitly-marked PRs for discussing breaking changes like this; I don't have super strong opinions on whether this thing should be a breakage, but I do think they should be done with careful deliberation. I typically prefer avoiding breakages whenever possible, but my concern is more about how they're done rather than what they are.

@pnkfelix
Copy link
Member

pnkfelix commented Jan 23, 2020

triage: P-high. Nominating for discussion at T-compiler meeting. (Or at least meta-discussion: I want our policy to be clear about situations like this.)

@pnkfelix pnkfelix added I-nominated P-high High priority labels Jan 23, 2020
@pnkfelix
Copy link
Member

pnkfelix commented Jan 23, 2020

assigning task of turning this back into a warning or into a deny-by-default lint to @Centril

@Centril
Copy link
Contributor

Centril commented Jan 28, 2020

I've made this into a deny-by-default C-future-compat lint in #68586.

While I don't feel strongly about this specific case in any way (I was going to make it a lint originally), I agree with @petrochenkov that this is a bug that should be fixed (by making it a hard error eventually) as this is consistent with e.g. #[repr(packed, packed(2))] also being a hard error.

That said, I object to the claims made here around not having followed the process.

First, regarding not having ran crater before, this is hardly the first time that we've done that, so it is reflective of current praxis. Based on an assessment of the individual case, I think it's entirely legitimate to wait for the beta crater run (or someone to make an issue) and then decide to revert, if we believe that an issue is sufficiently unlikely to occur in the wild (3.5 years as a hard-coded error is indicative of this) and if we think that reverting to a lint would be trivial (as is the case here).

As for "soundness" vs. "compiler bug", while the RFC does talk mostly about soundness bugs, as those are the most problematic, nothing in it says that compiler bugs are less allowed. Indeed, the RFC even allows non-bug design defects to be fixed in rare occasions.

Yes, but it's a normal hardcoded (non-lint) warning.

I think this is stronger than a deny-by-default lint except perhaps for the message. It is what we used for the NLL migration specifically because we wanted to migrate more forcefully and quickly.

@nikomatsakis
Copy link
Contributor

Thanks @Centril for doing that.

As for the questions around policy, I believe everyone here is acting in good faith. What I think we're seeing in this issue and others is that our policy can be rather loose and there are some disagreements around just how strict we ought to be. I've been increasingly thinking that it would make sense for us to revisit the policies and see if we can't make them more precise.

Regarding what is correct behavior, one thing that I've been missing here, tbqh, is a good write-up of precisely what conditions are generating the warning and so forth. For example, I've seen @petrochenkov use the example of #[repr(u8, u16)], and I saw @Centril write #[repr(packed, packed(2))], both of which seem plainly incompatible, but the actual example is #[repr(C)] #[repr(u32)], and #[repr(C, u32)] is permitted (and has particular meaning).

This is what I believe to be the case:

  • The lint applies when there are multiple #[repr] attributes.
  • The lint does not apply to incompatible items within a single repr, such as #[repr(u8, u16)].
  • Presently the compiler ignores one of them -- but I'm not sure how predictable it is which one gets ignored. It seems likely to me that we wind up keeping the first/last one or something, but I've not read the code, maybe it's more random than that.

As for multiple repr attributes, I could see three semantics that make sense to me:

  • Pick one predictably and lint that the others are deprecated/ignored
  • Hard error
  • Merge them into a single #[repr] attribute, such that #[repr(C)] and #[repr(u8)] are compatible

Note that "pick unpredictably" is not in that set. =)

Does this sound accurate? Are we really picking unpredictably?

@hanna-kruppe
Copy link
Contributor

The number of repr attributes (#[repr(C)] #[repr(u8)] vs #[repr(C, u8)]) shouldn't matter. Historically there was a bug where the hard-coded warning about incompatible reprs didn't catch conflicting hints that came from different attributes, but I fixed that two years ago and ever since, the lint should treat them the same: whether you spell it as one attribute or two, repr(C, Int) is permitted on enums with fields but considered conflicting on fieldless enums. Meanwhile, repr(u8, u16) is always considered conflicting -- regardless of whether it's one attribute or spread over multiple attributes.

The actual codegen effects are mediated through ReprOptions on the Ty, which effectively takes the union of hints from all #[repr(...)] attributes. The result of conflicting hints depend on what part of ReprOptions any given part of rustc looks at, not (just) on source order. repr(u8, u32) and the like would consistently pick the last one as discriminant type, but repr(C, u8) results in a ReprOptions where repr.c() is true and repr.discr_type() returns u8. While such a ReprOptions is expected and supported on enums with fields, I can't predict what results will fall out of the existing code for fieldless enums.

@Manishearth
Copy link
Member Author

@rkruppe True, but this is not a question of how the compiler handles this today, but rather about how the compiler should handle this. It may be true that the compiler handles this badly, but it can also be true that the compiler shouldn't in this case.

#[repr(C, Int)] seems to me like something we should try to support, for multiple reasons:

  • We recommend you slap #[repr(C)] on everything going through FFI.
  • enum class Foo: int_type exists in C++ and this feels like the obvious choice to represent it
  • It works on dataful enums! It's very weird that this works on dataful enums but not dataless enums, perhaps the only Rust feature I can think of that does that (the opposite is often true)

It should be relatively straightforward to collapse #[repr(C, Int)] into #[repr(Int)] internally for dataless enums in the implementation.

@hanna-kruppe
Copy link
Contributor

My comment was intended to correct errors in @nikomatsakis's description of the status quo:

This is what I believe to be the case:

  • The lint applies when there are multiple #[repr] attributes.
  • The lint does not apply to incompatible items within a single repr, such as #[repr(u8, u16)].
  • Presently the compiler ignores one of them -- but I'm not sure how predictable it is which one gets ignored. It seems likely to me that we wind up keeping the first/last one or something, but I've not read the code, maybe it's more random than that.

This matters because it influences how to proceed, most importantly the "merge [multiple repr attributes] into a single attribute" suggestion is predicated on a wrong assumption. I was not trying to imply anything about what ought to be.

If I must have an opinion on that, I agree that for repr(C, Int) it is most sensible to collapse to repr(Int), and that implementing that should be easy. Another nice thing about this solution is that it makes the memory layout of repr(C, Int) enums without fields match the memory layout of equivalent enums with ZST fields, so in a way it's the correct extrapolation to the fieldless case.

However, FWIW, the ABI would be subtly different:

  • #[repr(C, Int)] enum Foo1 { A(Align1ZST), B } is passed as an aggregate (since it's equivalent to a union of structs containing tag + variant fields)
  • #[repr(C, Int)] enum Foo2 { A, B } is passed as a primitive integer

For many choices of Int and target that doesn't actually cause ABI incompatibilities, but e.g. for Int = u64 on i686-unknown-linux-gnu it does: Foo2 is passed in eax + edx, while Foo1 should be passed on the stack (although #68190 currently masks that). I don't think this overrides the many reasons to treat repr(C, Int) as repr(Int) for fieldless enums, but it is kind of surprising and inelegant, so I wanted to note it.

@Manishearth
Copy link
Member Author

@Centril Thanks for turning it into a future incompat lint.

The process complaint is not just about crater, however. Even if one accepts that skipping crater is the current praxis (as Niko said in general, this is probably something we should explicitly discuss), the fact that this was buried unmarked in the corner of a refactoring PR is a major process violation, IMO. There should always be a future-incompat bug/PR filed if you're aware of a breaking change, period. You're right that there are various arguments to be made for this change "not being a big deal", e.g. it was a warning for three years. These arguments should have been made on such an issue. Making them post-facto isn't helpful, the lang/compiler team should have had an opportunity to discuss this. From the discussion here it's already clear that there are some parts of the change -- specifically around #[repr(C, Int)] that are less clear cut than others (Niko earlier mentioned how he didn't find this change to be necessary), and in the interest of making breaking changes as tightly scoped as possible this breakage could have been left out, if there had been an opportunity for this to be brought up. That's my main concern. Beta crater is also a concern for me, but a lesser one.

I think this is stronger than a deny-by-default lint except perhaps for the message.

I don't see how this is the case from the user's perspective, which is what matters here.

@Centril
Copy link
Contributor

Centril commented Jan 28, 2020

The conversation here regarding the specifics of repr(...) is probably best continued in #68585 as this issue is getting closed when the regression-fix PR lands. Perhaps someone could summarize the technical specifics, e.g. @rkruppe?


@Manishearth

I do not want to belabour the specifics of what happened here too much, but I disagree that filing a C-future-compatibility issue (and presumably waiting with the warning => deny => error procedure) is a requirement (in general, as opposed to this specific case). The stability RFC makes it, in my view, quite explicit that bug fix for which the impact is deemed "small", the fix can be landed immediately.

I don't see how this is the case from the user's perspective, which is what matters here.

The fact that a warning cannot go away, as compared to the norm (lints), would at least make me wonder what is up and make me conclude that they really must not want me to write it this way.

@Manishearth
Copy link
Member Author

I disagree that filing a C-future-compatibility issue (and presumably waiting with the warning => deny => error procedure) is a requirement (in general, as opposed to this specific case)

I did not say that the warning/deny/error procedure is a requirement. I said a very simple thing:

I think at a bare minimum filing the issue is a requirement. I also think that the warning/deny/error pipeline should be the default, but I recognize that there are cases where that isn't really necessary. Arguments to support that should be made in the issue.

The stability RFC makes it, in my view, quite explicit that bug fix for which the impact is deemed "small", the fix can be landed immediately

There was no venue for the team to properly determine if the impact was deemed small. This should exist. And yes, this isn't explicitly listed in the RFC, but that's also been the norm for breaking changes for years.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P-high High priority regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants