distinct 'static' items never overlap #1657

Open
wants to merge 2 commits into
base: master
from

Conversation

RalfJung
Copy link
Member

@RalfJung RalfJung commented Oct 20, 2024

It seems like so far we did not actually guarantee this.

While we are at it, also clarify that static initializers can read even mutable statics, and what happens in that case.

@workingjubilee
Copy link
Member

I don't think you understood my example? I was positing that this seems to be a legal interpretation of the reference's current text:

static ZEE: u8 = 0;
static ZED: u8 = 0;
assert_eq!(&raw const ZEE, &raw const ZEE);
assert_eq!(&raw const ZED, &raw const ZED);
assert_eq!(&raw const ZEE, &raw const ZED);

@workingjubilee
Copy link
Member

Because this

I would say if a static has a wibbly wobbly address not equal to itself, that's not a "precise memory location". We don't even have addresses that are not equal to themselves.

cannot be a response to what I actually said if it is said with an understanding of what I was trying to say. :/

@RalfJung
Copy link
Member Author

RalfJung commented Oct 20, 2024

I think I understood the example? I don't understand how that can be a valid interpretation of the text. A static has a location, &raw const gives you a pointer pointing there. Different locations compare unequal.

@RalfJung
Copy link
Member Author

RalfJung commented Oct 20, 2024

Oh, maybe I didn't quite understand the example.

But statics are certainly intended to be unique and disjoint. That's their point -- they describe a place, distinct from all other places.

More specifically, statics form their own allocated objects that don't overlap with any other allocated object. So in fact ZST statics are not quite unique -- but statics of type i32 are guaranteed to be at least 4 apart.
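To illustrate the "at least 4 apart" claim, here is a minimal sketch. It is not part of the PR, and the item names `A` and `B` are made up for the example. It measures the distance between the start addresses of two `i32` statics; under the disjointness guarantee discussed here, that distance should never be smaller than `size_of::<i32>()`.

```rust
// Two distinct, non-zero-sized static items (hypothetical names).
static A: i32 = 1;
static B: i32 = 2;

/// Absolute distance in bytes between the start addresses of `A` and `B`.
fn static_distance() -> usize {
    let a = &A as *const i32 as usize;
    let b = &B as *const i32 as usize;
    a.abs_diff(b)
}
```

Calling `static_distance()` and comparing the result against `std::mem::size_of::<i32>()` checks the property. Two zero-sized statics would give no such lower bound.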

@RalfJung
Copy link
Member Author

@rustbot label +T-opsem

@rustbot rustbot added the T-opsem Team: opsem label Oct 20, 2024
@RalfJung
Copy link
Member Author

@rfcbot merge
since so far it seems like we haven't actually documented "different statics are disjoint".
Cc @rust-lang/lang

@rfcbot
Copy link

rfcbot commented Oct 20, 2024

Team member @RalfJung has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns.
See this document for info about what commands tagged team members can give me.

@RalfJung RalfJung changed the title from "attempt to clarify 'static' unique address guarantees" to "distinct 'static' items never overlap" Oct 20, 2024
@JakobDegen
Copy link

Lgtm, with the note that it's important this language remains restricted to static items, and not other language constructs that produce statics - const promotion, vtables, functions, etc. Will check my box when I'm off mobile, as apparently the GH app doesn't let you edit things anymore (or someone can do it for me :) )

@RalfJung
Copy link
Member Author

You probably don't have edit rights here. I don't.

You can use @rfcbot reviewed to check your box.

with the note that it's important this language remains restricted to static items, and not other language constructs that produce statics - const promotion, vtables, functions, etc.

I would argue those aren't statics, they are other kinds of global allocations -- exactly because of this fundamental difference.

@JakobDegen
Copy link

@rfcbot reviewed

@workingjubilee
Copy link
Member

workingjubilee commented Oct 20, 2024

hmm. how must the addressing work for this, then?

static BLAH: &str = "blah";
static ALSO_BLAH: &str = "blah";

Are these potentially two different pointers to the same string literal?

@RalfJung
Copy link
Member Author

Are these potentially two different pointers to the same string literal?

Yes.
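One way to read this answer, sketched under the assumption that the guarantee covers only the static items themselves: each static is its own allocation holding a `&str` (a pointer plus a length), so the two items sit at different addresses, while the string data they point to may or may not be shared. The helper function names below are made up for the example.

```rust
static BLAH: &str = "blah";
static ALSO_BLAH: &str = "blah";

/// Addresses of the two static items themselves; each one stores a `&str`.
fn static_item_addrs() -> (usize, usize) {
    (
        &BLAH as *const &str as usize,
        &ALSO_BLAH as *const &str as usize,
    )
}

/// Addresses of the string data the two statics point to.
/// Whether these are equal is left to the implementation.
fn string_data_addrs() -> (usize, usize) {
    (BLAH.as_ptr() as usize, ALSO_BLAH.as_ptr() as usize)
}
```

Code should only rely on the first pair being different; the second pair can compare either way depending on whether the literals were deduplicated.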

@saethlin
Copy link
Member

@rfcbot reviewed

@rfcbot
Copy link

rfcbot commented Oct 20, 2024

🔔 This is now entering its final comment period, as per the review above. 🔔

psst @RalfJung, I wasn't able to add the final-comment-period label, please do so.

@digama0
Copy link

digama0 commented Oct 20, 2024

@rfcbot reviewed

Comment on lines 12 to 13
program that is initialized with the initializer expression. This allocated object is disjoint from
all other allocated objects. All references and raw pointers to the static refer to the same
Copy link
Contributor

Do we want to guarantee that they are disjoint from other allocated objects? Or just other statics. IE. if I have:

static FOO: i32 = 0;
const BAR: i32 = 0;

fn foo(){
    assert!(!core::ptr::eq(&FOO, &BAR));
}

Should we guarantee the assertion will always pass?

Copy link
Contributor

I don't have concern permissions, but I think this should be addressed by T-opsem before the end of the FCP.

Copy link
Member Author

Yeah I think we should guarantee that, and it's what the text already says, isn't it?

Copy link
Member

That prevents emitting unnamed_addr on const items.

Copy link
Member

unnamed_addr specifically does coalescing of non-significant-address-items into each other, and potentially into a significant-address item.

Copy link
Member

The LangRef says:

Note that a constant with significant address can be merged with a unnamed_addr constant, the result being a constant whose address is significant.

Copy link
Member Author

Ah, that is unfortunate.

Copy link
Contributor

Yeah I think we should guarantee that, and it's what the text already says, isn't it?

That's what the proposed text says, my question is whether that's what we want it to say.

Copy link
Member

@workingjubilee workingjubilee Oct 21, 2024

I... honestly don't think that's surprising at all?

That's exactly what I would expect in a case of

const BIG_CONST: BigFrozen = BigFrozen::big_init();
static BIG_STATIC: BigFrozen = BIG_CONST;

That's the precise situation where the const will be unified "into" the static, and where it would be a clearly beneficial optimization.
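A self-contained version of this scenario might look like the sketch below. `BigFrozen` and `big_init` are not defined anywhere in the thread, so the definitions here are placeholders invented for illustration. The address comparison is deliberately returned rather than asserted, because if the const can be merged into the static, the result may be either `true` or `false`.

```rust
// Placeholder type standing in for the `BigFrozen` mentioned above.
#[derive(PartialEq, Debug)]
struct BigFrozen([u8; 64]);

impl BigFrozen {
    // Placeholder initializer; the real one is not shown in the thread.
    const fn big_init() -> Self {
        BigFrozen([7; 64])
    }
}

const BIG_CONST: BigFrozen = BigFrozen::big_init();
static BIG_STATIC: BigFrozen = BIG_CONST;

/// Reports whether a reference to the const happens to share the
/// static's address. Callers should not rely on either answer.
fn const_shares_address_with_static() -> bool {
    std::ptr::eq(&BIG_CONST, &BIG_STATIC)
}
```

The values always compare equal; only the address relationship is left open.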

Copy link
Member Author

I... honestly don't think that's surprising at all?

I guess we have different intuitions then.

But reality clearly works one way here, so I have updated the text to match reality. @chorman0773 please have a look.

@CAD97
Copy link

CAD97 commented Oct 20, 2024

@rfcbot reviewed

@workingjubilee
Copy link
Member

I have a concern: this proposed definition prevents emitting optimization annotations on non-static items that we may wish to have coalesced via optimizations that exploit the fact that const items have a non-significant address. Even if we wish to guarantee static unicity, it seems pointlessly penalizing to make this guarantee affect the ability to optimize, quite literally, the items that don't have unique addresses.

@saethlin
Copy link
Member

@rfcbot concern clarify-optimization-of-consts

I think the proposed text and the previous text have the same meaning, which is unfortunate because I think that we'd already specified that consts are not merged into statics. But given the participation on this PR, and its title which seems to be about overlap of statics, and the fact that I think rustc currently does not implement what is documented here, I would like the implications for consts to be spelled out clearly.

@RalfJung
Copy link
Member Author

RalfJung commented Oct 25, 2024

But given the participation on this PR, and its title which seems to be about overlap of statics, and the fact that I think rustc currently does not implement what is documented here, I would like the implications for consts to be spelled out clearly.

I gave that a shot, please have a look.

@saethlin
Copy link
Member

@rustbot resolve clarify-optimization-of-consts

Copy link
Contributor

@chorman0773 chorman0773 left a comment

Given the concern is now resolved, I see no further issues with this.

@workingjubilee
Copy link
Member

workingjubilee commented Nov 7, 2024

I'm not sure a prohibition here would do anything to users, only to implementations

Hmm. One of my preferred mental models for Rust is one where the compiler is a very trustworthy Rust programmer who writes an unnerving amount of unsafe code, and especially asm!. Likewise, I sometimes explain that the rules of unsafe can often be thought of as manually upholding the rules the compiler also obeys.

And I would find it hard to explain to people who enjoy playing linker tricks with static items why they shouldn't arrange for a given static ZST to be found at an address that might be within a given object, considering it occupies zero bytes. Occasionally they actually link static objects of non-zero size to be potentially-overlapping, and then often a conversation starts where we explain that the bytes of statics can't really overlap like that. But they need the ability to point at addresses somehow. Sometimes we have explained they should use extern static, but the conversation usually arrives at something like "but linking in a separate crate just to make these statics extern is super annoying!" And if a ZST suits their needs of being such a "marker", that seems well-enough.

Copy link
Contributor

@chorman0773 chorman0773 left a comment

Late on this, but my concern earlier was resolved, so approving now.

@workingjubilee
Copy link
Member

Note I am happy to be corrected on this if all uses of ZST markers by people engaging in linker-script-shenanigans are compatible with them being ZSTs that are never "within" another object, but my memory was that trick was used in a few ways and that they were always at risk of this being a problem, i.e. they might want to put one or both of those markers inside a larger static.

@CAD97
Copy link

CAD97 commented Nov 7, 2024

I think that allowing zero-sized static places to have an address strictly inside the address range of a different static is the slightly better choice. Because users can place statics at custom addresses by utilizing linker scripts, this isn't solely a question of restricting the implementation, but also one of restricting users. If implementation considerations are otherwise immaterial, the option with less potential UB should be preferred.

Zero-sized accesses/places are already special-cased, so special casing zero-sized static in the rule to explicitly exempt them from overlap rules doesn't seem unreasonable. But there are two very straightforward formulations that don't need to do so:

  • static items cannot alias other static items (using the definition of aliasing from the reference aliasing model).
  • No byte of memory is contained in more than one static item.

@nikomatsakis
Copy link
Contributor

I'm finding this conversation interesting but I also feel a bit of a hostile tone that I am finding confusing. This is not the only instance, but one example is where @workingjubilee wrote the following:

Can T-lang explain the actual utility of this prohibition?

The actual comment in question said that the text as written was ambiguous and asked for clarification. Certainly some people in the discussion felt that having the start address of a ZST static lie within another static ought not to be allowed but others did not, and hence we decided to ask that the text either not answer the question or provide an answer.

I'm highlighting this because lately I've noticed a lot of technical discussions that feel heated to me and it's starting to wear me out. It would be really helpful to me at least if we can try to explore the pros/cons of this question and others in a more neutral way.

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Nov 7, 2024

My two cents:

I agree with @workingjubilee's model of the compiler as "just another Rust programmer", that's how I like to think of it too (and I like to think of the type system as "guardrails" that keep you on the straight-and-narrow, but which you can choose to take off).

I don't know of specific utility from defining overlap based on the "start address", though I will say that I initially found it surprising to think of the address of a ZST-static falling inside another one, and so I can imagine people making that (perhaps incorrect) assumption. All this says is that we should be clear to document it whichever way we decide.

I am curious what people think about the uniqueness of ZST-statics as well. I think if I just consider the pointer value of a ZST-static to be effectively meaningless and arbitrary, which seems to be implied by definitions of overlap that focus on actual bytes, then it all makes a lot more sense to me and I agree seems consistent.

@pnkfelix
Copy link
Member

pnkfelix commented Nov 7, 2024

@nikomatsakis wrote:

The actual comment in question said that the text as written was ambiguous and asked for clarification. Certainly some people in the discussion felt that having the start address of a ZST static lie within another static ought not to be allowed but others did not, and hence we decided to ask that the text either not answer the question or provide an answer.

Just to emphasize this point further: Niko's original comment neglected to include one point that I think all of T-lang agreed upon during our meeting yesterday: During our conversation, I believe the whole team agreed that we would support allowing this PR to make progress if it simply restricted the guarantee to be solely regarding interactions between positively-sized statics.

I.e., that would allow us to leave the question of whether ZSTs also participated in the guarantee to be a matter we can resolve later.

(Obviously this would have the short-term effect of making the guarantees of zero-sized static items an implementation-specific detail that a programmer cannot rely upon (e.g. when writing unsafe code or linker scripts or whatnot), which is not an ideal state of affairs for the long-term. But it seemed to me from skimming the conversation in this thread that the participants here mostly cared about strengthening the guarantees for sets of positively-sized static items, and thus this compromise would be a reasonable way to make progress in the short term.)


Just to be clear: I do recognize that this compromise position would effectively mean delaying pinning down the definition of what "overlap" means, since the whole point of writing out overlap_a vs overlap_b was indeed about formally stating how zero-sized types (and, I think, values in general) should be treated. But I also think it is okay to delay that decision, as long as we e.g. open an issue saying that it still needs to be decided.

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Nov 7, 2024

Thinking on this a bit more, what my message could've made clearer is that we were actively looking for a recommendation here (along with rationale) because it seemed unclear what was best and we recognize others are much closer to the discussion. If there was not consensus amongst those participants (or if we find we don't agree with the rationale) another option is to defer the question.

The motivation then to include the questions for consideration is that these are things that came up in the meeting so if the rationale or decision does not address this then we will likely just have the same questions again.

@workingjubilee
Copy link
Member

What I was responding to was specifically @joshtriplett's comment:

In particular, I don't think it should be possible for a ZST to be in the middle of an unrelated object (e.g. an unrelated array of non-ZSTs).

@nikomatsakis proposed defining overlap as "the start of one object lies between the start and end of the other, exclusive". That would prevent that property.

Which doesn't seem to have a clear motivation behind why preventing that property is desirable, and it sounded pretty definite, at least? Perhaps I am attaching too much finality to "I don't think it should be possible"?

I'm aware I can come across as needling, and I don't particularly enjoy it either. I feel I get into situations like the one that started with this PR often enough, where everyone signs off on something within 2 hours and then suddenly raising my concern and making sure people understand it is immediately on a timer. Then I feel rushed and have to fumble together an explanation before people have Made Up Their Minds.

and honestly @nikomatsakis I'm sorry because I didn't read your message very closely as, seeing @joshtriplett's first, I was already typing and only kinda scanned what you said.

@nikomatsakis
Copy link
Contributor

Thanks for the clarification, @workingjubilee, I appreciate your message. I believe @joshtriplett's comment was meant to convey his personal opinion, but it can be difficult to tell the difference sometimes, and I already said how I felt my message could've been clearer.

I have a few takeaways from this:

  • When writing for the team, be explicit about when the team has a consensus position and when it does not.
  • When writing for oneself, it never hurts to be extra explicit about whether you are speaking for yourself or the team.

In general, I like the pattern of teams like opsem (or types) providing recommendations and rationale, and lang having the role of "double checking" and raising questions. In some sense, I view lang as playing the role of the non-expert, whereas the teams are the experts in the domain. The result has to make sense to both.

@RalfJung
Copy link
Member Author

RalfJung commented Nov 7, 2024

I think going down the route of "an allocation 0 bytes cannot 'overlap' with an existing allocation" is a bad idea, because we would be:
[...] inviting self-contradiction by complicating the model

I don't think so, the definition of "disjointness" that I suggested above is not particularly complicated.

implicitly saying ZSTs cannot actually exist in Rust in a way that is useful, by imposing constraints that say ZSTs can no longer be at certain locations, when programmers actually do rely on the ability to allocate ZSTs at certain locations!

That is the question, isn't it? Do they rely on this? Generally programmers can't allocate things at certain locations, the allocator picks the location. But people writing linker scripts might indeed be choosing locations for their statics.

Let's back up to first principles. Two ranges overlap if and only if there is some value that might appear in both ranges.

No, I don't think that has to be the usual definition of overlap of a range. Ranges are not sets.

I don't know which definition is more widely used in mathematics. @chorman0773 said my definition is used there; do you have a source for that? @digama0 do you have any idea?

To extend this principle to when the ranges might be empty, we need to handle that case and say that they cannot overlap if and only if:

No, my definition already covered the "empty range" case in the intended way. Sorry for not being clear about that.

fn does_overlap(r1: Range<usize>, r2: Range<usize>) -> bool {
    !(r1.end <= r2.start || r2.end <= r1.start)
}

Anyway, I'm also fine with just punting on the question for ZST for now. 🤷 It just pains me a little since I think many of the ways in which people think ZST are special aren't actually anything special, they fall out of the same general principles. For instance, people say things like "two &mut T have different addresses except when T is a ZST"... which is true but (a) unnecessarily complicated, and (b) unnecessarily weak. Instead we can say "two &mut T point to disjoint ranges of memory", and this is (a) stronger (e.g. if T is i32, it tells us they are at least 4 bytes apart), and (b) doesn't need an "except".
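The "disjoint ranges" formulation can be made concrete with a small sketch. This is an illustration only, and the function names are invented. It takes two exclusive references into the same array and measures how far apart they are: for `i32` the distance is at least 4 bytes, while for `()` the two references can sit at the same address without their (empty) ranges sharing any byte.

```rust
/// Distance in bytes between the targets of two exclusive references.
fn distance<T>(a: &mut T, b: &mut T) -> usize {
    (a as *mut T as usize).abs_diff(b as *mut T as usize)
}

/// Two `&mut i32` into one array: disjoint 4-byte ranges.
fn i32_distance() -> usize {
    let mut arr = [1i32, 2];
    let [x, y] = &mut arr;
    distance(x, y)
}

/// Two `&mut ()` into one array: both ranges are empty, so they are
/// disjoint even though the addresses coincide.
fn zst_distance() -> usize {
    let mut arr = [(), ()];
    let [x, y] = &mut arr;
    distance(x, y)
}
```

Here `i32_distance()` is at least `size_of::<i32>()`, and `zst_distance()` is 0 because every element of a `[(); 2]` sits at offset 0.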

But in the interest of documenting what we have consensus on and not closing the door on future possibilities, I can try to adjust the PR to only be about non-ZST statics.

@digama0
Copy link

digama0 commented Nov 8, 2024

No, I don't think that has to be the usual definition of overlap of a range. Ranges are not sets.

I don't know which definition is more widely used in mathematics. @chorman0773 said my definition is used there; do you have a source for that? @digama0 do you have any idea?

I think that the definition of overlap of sets is clearly that they have nonempty intersection, but I think your definition makes more sense in this context. The difference of course is whether we want to say that [0, 2) overlaps [1, 1), which is false for sets because the latter set is empty, but is true under the endpoint-based definition. Off the top of my head I don't know a formalization of this interval order; I think it's not likely to come up except in CS-like contexts, and in that case the definitions are usually tailored to the application anyway.
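The two candidate definitions can be put side by side. In this sketch, `does_overlap` is the endpoint-based function posted earlier in the thread, and `shares_a_byte` is a name invented here for the set-of-bytes reading ("no byte is contained in both ranges"). They agree on non-empty ranges and differ only when an empty range lies strictly inside another range.

```rust
use std::ops::Range;

/// Endpoint-based definition, as posted earlier in the thread.
fn does_overlap(r1: Range<usize>, r2: Range<usize>) -> bool {
    !(r1.end <= r2.start || r2.end <= r1.start)
}

/// Set-based definition (hypothetical name): the ranges overlap only
/// if at least one address is contained in both of them.
fn shares_a_byte(r1: Range<usize>, r2: Range<usize>) -> bool {
    r1.start.max(r2.start) < r1.end.min(r2.end)
}
```

For `[0, 2)` and `[1, 1)`, `does_overlap` returns `true` and `shares_a_byte` returns `false`. An empty range at either end, such as `[0, 0)` or `[2, 2)`, counts as non-overlapping under both definitions.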

I recall a related version of this issue: We currently allow ZST's to be magicked up anywhere, such that <*const ()>::dangling(n) produces a valid pointer for any n. But if so, then that implies that we can also create ZST allocations in the middle of other allocations, so it would not be the case that allocations must be disjoint in the interval sense (although they are still disjoint in the set-of-bytes sense).

@RalfJung
Copy link
Member Author

RalfJung commented Nov 8, 2024

I recall a related version of this issue: We currently allow ZST's to be magicked up anywhere, such that <*const ()>::dangling(n) produces a valid pointer for any n. But if so, then that implies that we can also create ZST allocations in the middle of other allocations,

As mentioned above, "magicking up" a ZST like that does not create a ZST allocation. It just creates a pointer/reference without provenance. Those are not the same thing.


I wonder if we are constrained by LLVM here... @nikic do you know if there will be trouble with LLVM when we have a zero-sized static located "inside" another static (or inside a stack/heap allocation)? When I do x = malloc(10), is LLVM then allowed to optimize static_addr <= x || static_addr >= x+10 to true?

@workingjubilee
Copy link
Member

I went surveying various embedded software implementations using Rust today. Many of them do in fact use patterns that would prefer to be able to place static ZSTs at fairly arbitrary locations (including inside another static) and use those markers to generate slices in the ways we've discussed, because they need to somehow reason about fixed ranges of memory in slice-like manners. Ideally we would have some sort of pattern that enables writing this, and if not by using the setoid-based rule here, then we'd want something else.

One or two do in fact do everything in the most tedious way (entirely asm! and linker script), but not many seem to tolerate slogging through that mire (...they still gotta suffer through the linker script though).

@RalfJung
Copy link
Member Author

RalfJung commented Nov 8, 2024

Note that using the addresses of those statics as markers can be fine, but under no circumstances is it fine to access the same underlying global memory with pointers derived from different static or extern static declarations (except, probably, those with the same link_name).

@nikic
Copy link
Contributor

nikic commented Nov 8, 2024

I went surveying various embedded software implementations using Rust today. Many of them do in fact use patterns that would prefer to be able to place static ZSTs at fairly arbitrary locations (including inside another static) and use those markers to generate slices in the ways we've discussed, because they need to somehow reason about fixed ranges of memory in slice-like manners. Ideally we would have some sort of pattern that enables writing this, and if not by using the setoid-based rule here, then we'd want something else.

One or two do in fact do everything in the most tedious way (entirely asm! and linker script), but not many seem to tolerate slogging through that mire (...they still gotta suffer through the linker script though).

Do the embedded use-cases just need ZST markers that are directly adjacent to other statics (i.e. they share a start/end), or are they nested strictly inside them? If the latter, do you have any example to share?

@nikic
Copy link
Contributor

nikic commented Nov 8, 2024

I wonder if we are constrained by LLVM here... @nikic do you know if there will be trouble with LLVM when we have a zero-sized static located "inside" another static (or inside a stack/heap allocation)? When I do x = malloc(10), is LLVM then allowed to optimize static_addr <= x || static_addr >= x+10 to true?

After some experimentation, I got this somewhat concerning result: https://llvm.godbolt.org/z/6qTvx3qxd

I guess test1 answers your question in terms of what LLVM currently assumes at least, while test0 looks like an outright miscompile to me. (Alive seems to be struggling with zero-sized globals and thinks that everything is UB: https://alive2.llvm.org/ce/z/qFySyd Filed as AliveToolkit/alive2#1109.)

@workingjubilee
Copy link
Member

workingjubilee commented Nov 8, 2024

I can go into some more detail after I get some sleep but a simplified example would look like this, so yes, strict inclusion: https://godbolt.org/z/eE6Wdjvnb

Most of the ones I looked at were from fairly diligent and clued-in folks, so even the cases that wrote code that is probably still UB tbh went to fairly extreme lengths to write contorted code that they clearly are hoping will avoid the notice of an optimizer. And each was contorted in its own unique way. I am offering this simpler example not because it's an exact replica of what their code is like, but an example of what I think they could be doing if we can support this. (...also, because it doesn't involve everyone learning to read linker script so they can discuss it.)

...but uh, that's fairly alarming and we should probably bring this up to Alive.

@RalfJung
Copy link
Member Author

RalfJung commented Nov 8, 2024

After some experimentation, I got this somewhat concerning result: https://llvm.godbolt.org/z/6qTvx3qxd

Could you add some comments that aid in interpreting the results? :)

@nikic
Copy link
Contributor

nikic commented Nov 8, 2024

After some experimentation, I got this somewhat concerning result: https://llvm.godbolt.org/z/6qTvx3qxd

Could you add some comments that aid in interpreting the results? :)

test0: zero-size global == start of alloca -> false
test1: zero-size global == middle of alloca -> false
test2: zero-size global == end of alloca -> unknown

test0 is a miscompile, and test1 depends on whether a zero-size global can be in the middle of an alloca or not. LLVM currently assumes it can't, but given that it also assumes it can't be at the start of one, it's hard to distinguish whether that's a bug or a feature :)

I think if there is a consensus that we really want to allow zero-size statics that arbitrarily overlap with other allocations, we probably could get through a LangRef change to that effect and adjust InstSimplify accordingly.

Though I think in this context, it's also important to distinguish between a) what addresses an extern static may be placed at (e.g. via linker script) and b) what addresses Rust itself can place a static. Even if we allow extern statics to overlap other statics, if you define a static directly in Rust, is Rust allowed to place it at an overlapping address?

@RalfJung
Copy link
Member Author

RalfJung commented Nov 8, 2024

That helps, thanks!

I don't have a strong opinion on whether we should allow zero-size statics that arbitrarily overlap with other allocations. If it can be done in LLVM without impacting relevant optimizations, I would generally err on the side of having less UB.

@workingjubilee
Copy link
Member

workingjubilee commented Nov 8, 2024

Though I think in this context, it's also important to distinguish between a) what addresses an extern static may be placed at (e.g. via linker script) and b) what addresses Rust itself can place a static. Even if we allow extern statics to overlap other statics, if you define a static directly in Rust, is Rust allowed to place it at an overlapping address?

My impression is that from the perspective of embedded developers, the artificial distinction we draw between extern "C" static and "native" Rust static is not very strong to them. Many do avoid it, but especially in code that I see when someone is asking for help debugging stuff, they don't bother with the extern "C" linkage. Partly because it works: as the example I show demonstrates, all you need is #[link_section] (or #[no_mangle], with slightly different tricks) and then you can start picking addresses.

And as I've mentioned, what's more important is having a blessed pattern that we can recommend to this community, rather than telling them what not to do, which mostly leads to them producing obfuscated code:

Ideally we would have some sort of pattern that enables writing this, and if not by using the setoid-based rule here, then we'd want something else.

@scottmcm
Copy link
Member

scottmcm commented Nov 8, 2024

@workingjubilee
Copy link
Member

I have opened rust-lang/unsafe-code-guidelines#546 to try to capture some of the discussion from here about the ZST issue and will try to expand on the embedded use-cases in that issue or the also-relevant rust-lang/unsafe-code-guidelines#545

@RalfJung
Copy link
Member Author

RalfJung commented Nov 9, 2024

@scottmcm okay, so that would place ZST constant allocations at NonNull::dangling addresses. That's already covered by the wording of this PR even without the ZST exception.

Do zero-sized local variables get an alloca, or what do we do about them?

@joshtriplett
Copy link
Member

What I was responding to was specifically @joshtriplett's comment:

In particular, I don't think it should be possible for a ZST to be in the middle of an unrelated object (e.g. an unrelated array of non-ZSTs).
@nikomatsakis proposed defining overlap as "the start of one object lies between the start and end of the other, exclusive". That would prevent that property.

Which doesn't seem to have a clear motivation behind why preventing that property is desirable, and it sounded pretty definite, at least? Perhaps I am attaching too much finality to "I don't think it should be possible"?

I'm aware I can come across as needling, and I don't particularly enjoy it either. I feel I get into situations like the one that started with this PR often enough, where everyone signs off on something within 2 hours and then suddenly raising my concern and making sure people understand it is immediately on a timer. Then I feel rushed and have to fumble together an explanation before people have Made Up Their Minds.

That comment, written during the meeting in the course of discussion, was very much meant to be an early indicator of "this raised some eyebrows in a meeting around what seemed like an unaddressed corner case; here were some various thoughts that came up". It was not meant to convey any kind of definitive conclusion on What Behavior The Team Wants, just to start a discussion. That said, I could absolutely have spelled that out more explicitly. Sorry if it came across as a Decision Being Made rather than a Discussion Being Had.

@digama0
Copy link

digama0 commented Nov 10, 2024

For what it's worth, my answer above was given without having read too much of the backscroll, and now that I have I'm leaning more toward allowing ZST allocations (or "allocations") to be anywhere, including strictly inside another allocation, consistently for all purposes. But for the purpose of this RFC, I'm also fine with just punting on the question for now.
