Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Share kind #11781

Closed
brson opened this issue Jan 24, 2014 · 30 comments · Fixed by #12686
Closed

Add Share kind #11781

brson opened this issue Jan 24, 2014 · 30 comments · Fixed by #12686
Labels
E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. P-medium Medium priority
Milestone

Comments

@brson
Copy link
Contributor

brson commented Jan 24, 2014

The Share kind is a type bound for types that can be safely shared across threads. This is necessary to support fork/join operations while capturing an environment safely. Nominating.

Migration Plan

  1. Add Share kind and replace usages of Freeze with Share in bounds.
    • Keep the NonFreeze markers and rules.
  2. Work on Introduce a Unsafe<T> type to be used as the basis for Cell, RefCell, Atomic, etc and remove Freeze #12577
    • Remove markers
    • Forbid values with unsafe interiors in static items
  3. Remove Freeze
@flaper87
Copy link
Contributor

What's the thought behind replacing Freeze with Share?

IIUC the idea, Share should imply Freeze but not the other way around.

@brson
Copy link
Contributor Author

brson commented Jan 25, 2014

Yes, Share is a superset of Freeze. It's not clear that Freeze continues to have a useful role once we have Share. Is knowing that a type is immutable good for other purposes?

@nikomatsakis
Copy link
Contributor

cc me. I am increasingly of the opinion that Freeze as distinct from
Share may not be necessary. I believe the main reason for it was to
allow Rc to be acyclic.

@thestinger
Copy link
Contributor

@brson: It's not necessary for any semantic reason that I know of, but I was planning on using it as the basis for type-based alias analysis (&T is NoAlias with all other pointers when it is Freeze). The extra burden is the need to mark a thread-safe type with inner mutability. I think this would be a single location inside an unsafe Arc type used to build mutable concurrent data structures.

@flaper87: A MutexArc can be shared across threads but is not Freeze. This applies to any mutable concurrent data structures.

@sanxiyn
Copy link
Member

sanxiyn commented Jan 27, 2014

See also the post Data Parallelism in Rust.

@flaper87
Copy link
Contributor

I'm interested in working on this!

@ghost ghost assigned flaper87 Jan 27, 2014
@brson
Copy link
Contributor Author

brson commented Jan 30, 2014

@flaper87 OK. You've got it. This may be as easy as renaming Freeze and #[no_freeze] to Share and #[no_share] and removing the #[no_share] attributes from types like MutexArc that are now sharable.

@nikomatsakis I am worried that Rust allows some types to be unsafely shared by default. Would making unsafe pointers unsharable by default prevent this?

@pnkfelix
Copy link
Member

Accepted for 1.0, assigning P-high.

@pnkfelix
Copy link
Member

(assigning E-mentor in hopes that @nikomatsakis has spare cycles to act as a mentor here... or someone else could...)

@nikomatsakis
Copy link
Contributor

I will mentor. @brson that's an interesting thought, maybe we can make unsafe sharing opt-in in this way

@brson
Copy link
Contributor Author

brson commented Jan 30, 2014

Share probably interacts in subtle ways with Send. At the least, something not Send can't be put into something Share, though maybe the same rule already applies to Send and Freeze.

@nikomatsakis
Copy link
Contributor

I think we want Freeze to stay. it's not clear to me that Send and Freeze have any inherent relationship. I think the breakdown is:

  • Send (deeply owned)
  • Freeze (deeply immutable if aliased)
  • Share (mutable, but threadsafe even if aliased)

I don't know that any of those three traits have any particular relationship to one another:

  • Freeze does not seem to imply Share: consider GC<T> -- it is Freeze, I think, but it is not threadsafe to access from another thread.
  • Send does not seem to imply Freeze: consider Cell<T>, which owns its contents.
  • Send does not seem to imply Share: consider again Cell<T>.

For many cases, then, multiple bound are probably required:

  • Sending data requires Send
  • ARC wants Send+Share
  • ...

OK, gotta go, maybe more writeup / thought is needed here than I previously thought!

Note: In the past I've considered a fourth builtin bound Fork, which was supposed to be something like "mutable but threadsafe is not aliased". But I'm not sure that this bound is still needed. The primary motivation was to describe the set of closures that can safely be executed as data-parallel jobs. But I think it's better to address this via a combination of the above bounds and #8622 (closure, trait unification). That is, if we had two trait types like trait Fn { fn call(&self, ...) } and trait FnMut { fn call(&mut self, ...) }, then a "parallel fork" API would take &mut FnMut:Share and the "parallel map" API would take &Fn:Share, and I believe that all requirements are met.

@glaebhoerl
Copy link
Contributor

@nikomatsakis Gc<T> would be non-Freeze if we have a moving GC (we might want it non-Freeze either way to keep options open).

While the relationship between them is an interesting question, I think the more significant question is: in what use cases do you want Freeze and not Share?

@nikomatsakis
Copy link
Contributor

On Thu, Feb 20, 2014 at 01:33:27AM -0800, Gábor Lehel wrote:

@nikomatsakis Gc<T> would be non-Freeze if we have a moving GC
(we might want it non-Freeze either way to keep options open).

Hmm, interesting.

While the relationship between them is an interesting question, I
think the more significant question is: in what use cases do you
want Freeze and not Share?

At the moment I can't recall but I feel like I've come up with several
examples recently; let me get back to you.

@glaebhoerl
Copy link
Contributor

#12432 seems like a clear example

@nikomatsakis
Copy link
Contributor

@glaebhoerl ah yes, thanks. I think that's the rather convincing one I was thinking of.

@nikomatsakis
Copy link
Contributor

Editing the title. I believe that the current def'n of Freeze in TypeContents is out of date

@nikomatsakis
Copy link
Contributor

I've been thinking about this more. Based on some recent IRC conversations, I believe the correct meaning of Freeze may be:

A type T is Freeze if, given a variable x of type &'a T, all data reachable from x via x is immutable for the lifetime 'a.

The reason for including the clause "via x" is to allow proc to be freeze. I think a proc is freeze because you can't call it or interact with it from an aliased context; so even if it closers over mutable data, who cares?

But I'm not sure. Maybe the simpler definitions (without the via clause) suffice:

  • Send: Given x: T, x can be safely transferred to another task without leaving any aliases.
  • Freeze: Given x: &T, all data reachable from x is immutable
  • Share: Given x: &T, all data reachable from x is threadsafe

Here threadsafe is defined as "accessible from multiple threads without data races". Also, when I say "all data reachable from x", that's not quite the right phrase. I really mean something like "all operations possible on data reachable from x". Hence Atomic<T> is threadsafe.

@nikomatsakis
Copy link
Contributor

Specifically, Freeze currently excludes &mut T, which is definitely wrong, but proc, which may be wrong, depending on how we choose to define Freeze (see previous comment)

@nikomatsakis
Copy link
Contributor

Actually, it is if it would be valuable to include proc. It's a stretch -- the main idea is "a proc is inert in such a setting, so why not?" Original motivation was so that static immutable values (which must be Freeze) could include enums which have a variant that might have a proc, even if the particular variant used in the static does not. But we realized that applying the rules on static immutable values consistently (i.e., applying the rules to the value not the type) means that such an example is permitted anyhow. So i'm inclined to go with the simpler definition. Of course, given my "all operations possible on data reachable from x" sort of implies "via x" anyway, so maybe either way proc is included.

@nikomatsakis
Copy link
Contributor

Need more formalism ;)

@flaper87
Copy link
Contributor

I agree with @nikomatsakis on pretty much everything he said in the previous 3 comments.

EDIT: Commented here

@glaebhoerl
Copy link
Contributor

Under the expanded definitions of Freeze, would it be possible to make a non-Freeze type using safe code at all, or would it only apply to Cells, atomics, and other types which use unsafe for interior mutability?

@nikomatsakis
Copy link
Contributor

@glaebhoerl I think it is not possible without unsafe code.

@nikomatsakis
Copy link
Contributor

@glaebhoerl though of course anything including a Cell etc would not be freeze, so it's not that user-defined types would never be always be freeze

@glaebhoerl
Copy link
Contributor

of course, and I don't think it has any significance, I was just curious

@nikomatsakis
Copy link
Contributor

I've been thinking this over.

My current opinion is that Freeze and Share ought to mean:

  • Freeze implies that the interior is immutable if aliased.
  • Share implies that all reachable content is threadsafe.

The interior is defined as all content reachable without traversing a pointer.

Threadsafe is defined as "exposing no operation that permits a data race if multiple threads have access to a &T pointer simultaneously". (NB: the type system should guarantee that if you have access to memory via a &T pointer, the only other way to gain access to that memory is through another &T pointer, though this is blocked on #12624.) UPDATE: To be clear, this property does hold today, it's just that maintaining this property and having the APIs I want is blocked on #12624.

In both cases, when we say "immutable" or "threadsafe", we are concerned with the public API that a type exports. In other words, implementing Freeze or Share (presuming opt-in) is a contract between the module that defines the module and its non-descendants.

Therefore, Rc<T> is Freeze (regardless of T). Gc<T> is probably the only pointer type that is not Freeze, since we want to reserve the right to have a moving GC.

Without using unsafe code, all types are Share. When using unsafe code to enable interior mutability, the only way a type can be Share is if either atomics or other suitable barriers are used to establish the happens- before relation and thus prevent a data race.

Motivation for these definitions:

  1. Freeze is intended for use in optimization. It allows us to improve the performance of alias analysis because two &T pointers can be considered non-interfering. It is also used to prevent segfaults related to immutable data being placed in static values. For these purposes, interior guarantees are sufficient.
  2. Share is intended to prevent data races. For these purposes, reachability is the important metric, because we are concerned about all possible operations the user might do with access to that type.

Possible variations

While writing this, I was wondering if perhaps this definition isn't quite right, particularly concerning Freeze. This is current definition is sufficient to help LLVM optimization, but a stronger definition -- such as ownership or reachability? -- might enable the Rust front-end to perform more aggressive optimization. I am not sure, though. I was contemplating things like reordering function calls or pulling them out of loops, but given thread-local data, side-effects, and so on it's unclear if this would ever be safe unless we've analyzed the body of those calls. And if we've done that, then the Freeze type is probably not particularly helpful. So I'm still leaning towards this being the right set of definitions.

@nikomatsakis
Copy link
Contributor

Some further clarifications and notes:

  • One impact of those definitions I gave is that Rc<Cell<int>> is freeze. Perhaps this seems surprising, but the point is that the mutable data is not interior to the type but accessible only via pointer. (Also, as a side note, I can't see any definition of Freeze that permits Rc<int> to be Freeze but not Rc<Cell<int>>, since the ref count is always mutable via user operations.)
  • I'd prefer not to have rules that hinge on ownership but rather to stick to reachability or interior. The reason for that is that as we move to a smart pointer world, ownership becomes...trickier to define and validate. Though to some extent these rules are for humans to consume. But still, where possible I'd like to stick to things that are clearly defined in terms of machine mechanics.

@nikomatsakis
Copy link
Contributor

Created #12683 to cover Freeze. This issue can remain for Shared.

@flaper87
Copy link
Contributor

flaper87 commented Mar 4, 2014

@nikomatsakis I agree with pretty much everything you said in your last comment. I just want to put some emphasis in some of the points that had me thinking when I started working on the Share kind.

Freeze implies that the interior is immutable if aliased

Sounds right. One point though, which probably doesn't conflicts with this definition, is that we may want to consider reachability for Freeze too. For instance, things with private mutable references could be considered Freeze. We may want to do this later, though.

Without using unsafe code, all types are Share

This sounds right to me!

Therefore, Rc<T> is Freeze

This means (as we said earlier on IRC) that Rc<Cell<int>> would be considered Freeze, which sounds right to me too.

@flaper87 flaper87 mentioned this issue Mar 4, 2014
4 tasks
bors added a commit that referenced this issue Mar 20, 2014
`Share` implies that all *reachable* content is *threadsafe*.

Threadsafe is defined as "exposing no operation that permits a data race if multiple threads have access to a &T pointer simultaneously". (NB: the type system should guarantee that if you have access to memory via a &T pointer, the only other way to gain access to that memory is through another &T pointer)...

Fixes #11781
cc #12577 

What this PR will do
================

- [x] Add Share kind and
- [x]  Replace usages of Freeze with Share in bounds.
- [x] Add Unsafe<T> #12577
- [x] Forbid taking the address of a immutable static item with `Unsafe<T>` interior

What's left to do in a separate PR (after the snapshot)?
===========================================

- Remove `Freeze` completely
bors added a commit to rust-lang-ci/rust that referenced this issue Dec 1, 2023
Verify Borrow<T> semantics for types that implement Hash, Borrow<str> and Borrow<[u8]>.

Fixes rust-lang#11710

The essence of the issue is that types that implement Borrow<T> provide a facet or a representation of the underlying type. Under these semantics `hash(a) == hash(a.borrow())`.

This is a problem when a type implements `Borrow<str>`, `Borrow<[u8]>` and Hash, it is expected that the hash of all three types is identical. The problem is that the hash of [u8] is not the same as that of a String, even when the byte reference ([u8]) is derived from `.as_bytes()`

- [x] Followed [lint naming conventions][lint_naming]
- [x] Added passing UI tests (including committed `.stderr` file)
- [x] `cargo test` passes locally
- [x] Executed `cargo dev update_lints`
- [x] Added lint documentation
- [x] Run `cargo dev fmt`

---

 - [x] Explanation of the issue in the code
 - [x] Tests reproducing the issue
 - [x] Lint rule and emission

---

changelog: New lint: [`impl_hash_borrow_with_str_and_bytes`]
[rust-lang#11781](rust-lang/rust-clippy#11781)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. P-medium Medium priority
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants