Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for once_cell #74465

Closed
9 tasks done
KodrAus opened this issue Jul 18, 2020 · 146 comments
Closed
9 tasks done

Tracking Issue for once_cell #74465

KodrAus opened this issue Jul 18, 2020 · 146 comments
Assignees
Labels
A-concurrency Area: Concurrency B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@KodrAus
Copy link
Contributor

KodrAus commented Jul 18, 2020

This is a tracking issue for the RFC "standard lazy types" (rust-lang/rfcs#2788).
The feature gate for the issue is #![feature(once_cell)].

Unstable API

// core::lazy

pub struct OnceCell<T> { .. }

impl<T> OnceCell<T> {
    pub const fn new() -> OnceCell<T>;
    pub fn get(&self) -> Option<&T>;
    pub fn get_mut(&mut self) -> Option<&mut T>;
    pub fn set(&self, value: T) -> Result<(), T>;
    pub fn get_or_init<F>(&self, f: F) -> &T where F: FnOnce() -> T;
    pub fn get_or_try_init<F, E>(&self, f: F) -> Result<&T, E> where F: FnOnce() -> Result<T, E>;
    pub fn into_inner(self) -> Option<T>;
    pub fn take(&mut self) -> Option<T>;
}
impl<T> From<T> for OnceCell<T>;
impl<T> Default for OnceCell<T>;
impl<T: Clone> Clone for OnceCell<T>;
impl<T: PartialEq> PartialEq for OnceCell<T>;
impl<T: Eq> Eq for OnceCell<T>;
impl<T: fmt::Debug> fmt::Debug for OnceCell<T>;

pub struct Lazy<T, F = fn() -> T> { .. }

impl<T, F> Lazy<T, F> {
    pub const fn new(init: F) -> Lazy<T, F>;
}
impl<T, F: FnOnce() -> T> Lazy<T, F> {
    pub fn force(this: &Lazy<T, F>) -> &T;
}
impl<T: Default> Default for Lazy<T>;
impl<T, F: FnOnce() -> T> Deref for Lazy<T, F>;
impl<T: fmt::Debug, F> fmt::Debug for Lazy<T, F>;

// std::lazy

pub struct SyncOnceCell<T> { .. }

impl<T> SyncOnceCell<T> {
    pub const fn new() -> SyncOnceCell<T>;
    pub fn get(&self) -> Option<&T>;
    pub fn get_mut(&mut self) -> Option<&mut T>;
    pub fn set(&self, value: T) -> Result<(), T>;
    pub fn get_or_init<F>(&self, f: F) -> &T where F: FnOnce() -> T;
    pub fn get_or_try_init<F, E>(&self, f: F) -> Result<&T, E> where F: FnOnce() -> Result<T, E>;
    pub fn into_inner(mut self) -> Option<T>;
    pub fn take(&mut self) -> Option<T>;
    fn is_initialized(&self) -> bool;
    fn initialize<F, E>(&self, f: F) -> Result<(), E> where F: FnOnce() -> Result<T, E>;
    unsafe fn get_unchecked(&self) -> &T;
    unsafe fn get_unchecked_mut(&mut self) -> &mut T;
}
impl<T> From<T> for SyncOnceCell<T>;
impl<T> Default for SyncOnceCell<T>;
impl<T: RefUnwindSafe + UnwindSafe> RefUnwindSafe for SyncOnceCell<T>;
impl<T: UnwindSafe> UnwindSafe for SyncOnceCell<T>;
impl<T: Clone> Clone for SyncOnceCell<T>;
impl<T: PartialEq> PartialEq for SyncOnceCell<T>;
impl<T: Eq> Eq for SyncOnceCell<T>;
unsafe impl<T: Sync + Send> Sync for SyncOnceCell<T>;
unsafe impl<T: Send> Send for SyncOnceCell<T>;
impl<T: fmt::Debug> fmt::Debug for SyncOnceCell<T>;

pub struct SyncLazy<T, F = fn() -> T>;

impl<T, F> SyncLazy<T, F> {
    pub const fn new(f: F) -> SyncLazy<T, F>;
}
impl<T, F: FnOnce() -> T> SyncLazy<T, F> {
    pub fn force(this: &SyncLazy<T, F>) -> &T;
}
impl<T, F: FnOnce() -> T> Deref for SyncLazy<T, F>;
impl<T: Default> Default for SyncLazy<T>;
impl<T, F: UnwindSafe> RefUnwindSafe for SyncLazy<T, F> where SyncOnceCell<T>: RefUnwindSafe;
impl<T, F: UnwindSafe> UnwindSafe for SyncLazy<T, F> where SyncOnceCell<T>: UnwindSafe;
unsafe impl<T, F: Send> Sync for SyncLazy<T, F> where SyncOnceCell<T>: Sync;
impl<T: fmt::Debug, F> fmt::Debug for SyncLazy<T, F>;

Steps

Unresolved Questions

Inlined from #72414:

Implementation history

@KodrAus KodrAus added C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. needs-rfc This change is large or controversial enough that it should have an RFC accepted before doing it. labels Jul 18, 2020
@JohnTitor JohnTitor added the B-unstable Blocker: Implemented in the nightly compiler and unstable. label Jul 18, 2020
@matklad
Copy link
Member

matklad commented Jul 24, 2020

Let's cross-out the "should get be blocking?" concern. I decided against this for once_cell, for the following reasons:

  • it's makes Clone, Eq, Debug blocking, which is surprising
  • the original issue that prompted this question used Lazy, and Lazy is immune from this issue, as it always uses blocking get_or_init.

@matklad
Copy link
Member

matklad commented Jul 24, 2020

Added two more open questions from the RFC.

Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this issue Jul 24, 2020
@matklad
Copy link
Member

matklad commented Jul 27, 2020

I've added a summary of proposed API to the issue description.

I wonder if makes sense for @rust-lang/libs to do a sort of "API review" here: this is a pretty big chunk of API, and we tried to avoid bike shedding on the RFC.

@matklad
Copy link
Member

matklad commented Oct 2, 2020

Here's an interesting use-case for non-blocking subset of OnceCell -- building cyclic data structures: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=4eceeefc224cdcc719962a9a0e1f72fc

@withoutboats
Copy link
Contributor

I strongly expect a method called get to be nonblocking. I am softly in favor of adding a wait API that blocks, but would prefer that it be added in a separate feature later based on demand.

@matklad
Copy link
Member

matklad commented Oct 2, 2020

Yeah, to be clear, there's a consensus that get should be non-blocking, the question is resolved. What is not completely solved in my mind, is where we should have core::lazy::SyncOnceCell. That's possible in theory (by only providing get and set methods), but would be hacky to implement, and of questionable usefulness. The above example is a new use-case for that thing.

@m-ou-se
Copy link
Member

m-ou-se commented Oct 4, 2020

Naming. I'm ok to just roll with the Sync prefix like SyncLazy for now, but have a personal preference for Atomic like AtomicLazy.

I don't think Atomic would be the right word for these. Rust's Atomic types and operations (including Arc) never block and never involve the operating system's scheduler (they're all defined in core or alloc, not std). They're all directly based on the basic atomic operations supported by the processor architecture itself.

I'd expect something that's named AtomicLazy/AtomicOnceCell to do the same. And that's something that already exists as another valid strategy for certain Lazy/OnceCell-like types: Instead of blocking all but one thread when multiple threads encounter an 'empty' cell, it wouldn't block but run the initialization function on each of these threads. The first thread to finish atomically stores its initialized value in the cell, and the others simply drop() the value they created.

The std does something similar in a few places (although not wrapped in a type or publicly exposed). For example, here:

// There is no locking here. It's okay if this is executed by multiple threads in
// parallel. `lookup` will result in the same value, and it's okay if they overwrite
// eachothers result as long as they do so atomically. We don't need any guarantees

And here:

match self.lock.load(Ordering::SeqCst) {
0 => {}
n => return n as *const _,
}
let inner = box Inner { remutex: ReentrantMutex::uninitialized(), held: Cell::new(false) };
inner.remutex.init();
let inner = Box::into_raw(inner);
match self.lock.compare_and_swap(0, inner as usize, Ordering::SeqCst) {
0 => inner,
n => {
Box::from_raw(inner).remutex.destroy();
n as *const _
}
}
}

And another example in parking_lot.

So, since this Lazy/OnceCell implementation does block (such that the initialization function can be FnOnce and the type doesn't have to fit in an atomic), and an alternative purely atomic strategy does exist, I'd really avoid using the word 'atomic' in the name here.

@matklad
Copy link
Member

matklad commented Nov 11, 2020

I've added non-blocking flavors of the primitives to the once_cell crate: https://docs.rs/once_cell/1.5.1/once_cell/race/index.html. They are restricted (can be provided only for atomic types), but are compatible with no_std.

It seems to me that "first one wins" is a better semantics if you can't block, so I am going to resolve Sync no_std subset like this:

  • only std supports sync module, as it requires synchronization (std::thread)
  • if you need something like OnceCell in no-std, your choices are
    • use race module from once_cell with different API, which might or might not be upifted to std some day
    • use some version based on spin locks (this risks @matklad crashing into issue tracker of your project with explanation of how pure spin locks are almost always wrong).

I've ticked this question's box.

@m-ou-se
Copy link
Member

m-ou-se commented Nov 11, 2020

It's a bit of a shame that Lazy uses a fn() -> T by default. With that type, it needlessly stores a function pointer even if it is constant. Would it require big language changes to make it work without storing a function pointer (so, a closure as ZST), while still being as easy to use? Maybe if captureless closures would implement some kind of const Default? And some way to not have to name the full type in statics. That's probably not going to happen very soon, but it'd be a shame if this becomes possible and we can't improve Lazy because the fn() -> T version was already stabilized. Is there another way to do this?

@phil-opp
Copy link
Contributor

@matklad

They are restricted (can be provided only for atomic types), but are compatible with no_std.

This seems like a very major restriction, which rules out most use cases of SyncLazy/SyncOnceCell. So I don't think that this really resolves the sync no_std use case.

I agree that spinlocks have their problems, but they're still better than using static mut instead. I understand that we don't want to hardcode SyncLazy/SyncOnceCell to use a deadlock-prone spinlock on no_std, but maybe it's possible to let the user supply their own implementation of a Mutex/Once primitive?

This could be implemented using a second generic argument on the Sync* types (or maybe even on the Mutex/Once types). This way, users could specify how the synchronization should happen based on their application. A single-threaded embedded application could just disable interrupts for the critical section, a toy OS kernel could use a spinlock, and projects with their own threading system could supply a "proper" synchronization primitive. Maybe I'm missing something, but this seems like a good solution to me.

@m-ou-se
Copy link
Member

m-ou-se commented Nov 11, 2020

Some thoughts about &mut self functions on (Sync)OnceCell:

These types have both &mut self and &self functions, but the &mut interface seems somewhat incomplete, and it's a bit tricky to pick names for overlapping functionality. For example, take can only be done with unique access, so fn take(&mut self) -> Option<T> makes sense. But set can be done on an empty cell through a shared reference, or on a cell in any state through an unique reference. So both fn set(&self, value: T) -> Result<&T, T>; (like Cell::set) and fn set(&mut self, value: T) -> &mut T; (like Option::insert) would make sense.

Maybe if the get_or_insert/get_or_insert_with pair already provides a 'one time set' functionality, set (or insert?) should be the &mut self version instead?

@matklad
Copy link
Member

matklad commented Nov 12, 2020

Unresolved question: method naming

Currently, we have get_or_init and get_or_try_init. Are those good names? Here are some alternatives (see also #78943)

  1. get_or_init, get_or_try_init
  2. get_or_insert_with, try_get_or_insert_with
  3. get_with, try_get_with

1. Pro: Status Quo, name specific to OnceCell (you see x.get_or_init, you know x is one cell). Con: doesn't feel like it perfectly fits with other std names.
2. Pro: matches Option::get_or_inser_with exactly. Con: for OnceCell, unlike Option, this is the core API. It's a shame that its a mouthful.
3. Pro: short, matches std conventions. Con: _with without or suggest that the closure will be always called, but it's not really the case.

I've though more about this, and I think I actually like 3 most. It's Con seems like a Pro to me. In the typical use-case, you only use _with methods:

impl Spam {
  fn get_eggs(&self) -> &Eggs {
    self.eggs.get_with(|| Eggs::cook())
  }
}

So, the closure is sort-of always called, it's just cached. Not sure if I my explanation makes sense, but I do feel that this is different from, eg, Entry::or_insert_with.

@matklad
Copy link
Member

matklad commented Nov 12, 2020

@phil-opp: I think it is rather certain that, even if std provides a subset of OnceCell for no_std, it will be non-blocking subset (set and get).

It certainly is possible to use spinlocks, or make sync::OnceCell parametric (compile-time or run-time) over blocking primitives. I am pretty sure that should be left for crates.io crate though.

I feel one important criterion for inclusion in std is "design space has a solution with a single canonical API". OnceCell API seem canonical. If we add paramters, the design space inflates. Even if some solution would be better, it won't be obviously canonical, and would be better left to crates.io.

@matklad
Copy link
Member

matklad commented Nov 12, 2020

It's a bit of a shame that Lazy uses a fn() -> T by default.

@m-ou-se yeah, totally agree that this is a hack and feels like a hack. It works well enough in practice, but there's one gotcha: specifying type for a local lazy does not work:

let x = 92;
let works1: = Lazy::new(|| x.to_string());
let broken: Lazy<String> = Lazy::new(|| x.to_string());
let works2: Lazy<String, _> =  Lazy::new(|| x.to_string());

The broken variant is something that people occasionally write, and it fails with a somewhat confusing error. If we remove the default type, it will still be broken, but folks won't have intuition that "one parameter should be enough".

One easy way out here is to stabilize only OnceCell, and punt on Lazy for the time being. OnceCell contains all the tricky bit, and Lazy is just some syntactic sugar. For me (and probably for some, but not all, other folks) writing

fn global_state() -> &'static GlobalState {
  static INSTANCE: SyncOnceCell<GlobalState> = SyncOnceCell::new();
  INSTANCE.get_or_init(GlobalState::default)
}

doesn't feel like a deal breaker.I'd prefer that to pulling a 3rd party dep (lazy_staic or once_cell).

That said, I think Lazy's hack is worth stabilizing. Even if in the future we'll be able to write:

static GLOBAL_STATE: Lazy<GlobalState, _> = Lazy::new(GlobalState::default);

I don't see a lot of practical problems with

static GLOBAL_STATE: Lazy<GlobalState> = Lazy::new(GlobalState::default);

working as well.

@nwn
Copy link

nwn commented Nov 12, 2020

Unresolved question: method naming

1. `get_or_init`, `get_or_try_init`

2. `get_or_insert_with`, `try_get_or_insert_with`

3. `get_with`, `try_get_with`

I think 1 is the most appropriate. The init terminology makes more sense than insert in the context of a once cell. Depending on whether we expose a direct value initializer, it may be more consistent to add _with to these methods, though.

I've though more about this, and I think I actually like 3 most. It's Con seems like a Pro to me. In the typical use-case, you only use _with methods:

[...]

So, the closure is sort-of always called, it's just cached. Not sure if I my explanation makes sense, but I do feel that this is different from, eg, Entry::or_insert_with.

This doesn't seem very intuitive to me and isn't always true when there are multiple points of initialization. For example, consider:

impl Spam {
    fn get_eggs(&self, cooked: bool) -> &Eggs {
        if cooked {
            self.eggs.set(Eggs::cook());
        }
        self.eggs.get_with(|| Eggs::raw())
    }
}

In this case, the closure may not run and in fact a different value has been cached. I think get_or_init_with would make this case more clear.

@raphaelcohn
Copy link

Something I've recently got bitten by is the need to manage which memory allocator a memory uses. I've been workign wit ha design that has a different global memory allocator when running threads or coroutines (so restricting a coroutine to a maximum amount of memory). This could be thought of as a bit of a hack; one of the long-term design decisions of early Rust that still bites is not making the memory allocator type explicit in the standard collections.

With a lazy, the challenge becomes ensuring that they're all allocated using the same memory allocator.

@tisonkun
Copy link
Contributor

tisonkun commented Aug 3, 2023

Can we have a get_mut_or_init?

@SimonSapin
Copy link
Contributor

OnceCell::get_mut_or_init would only be safe if you have &mut self. But if you have exclusive access when initializing, do you even need a cell in the first place? Couldn’t you use a plain Option and initialize it with get_or_insert_with?

@cuviper
Copy link
Member

cuviper commented Aug 7, 2023

Can't you say the same about get_mut? But it may be a conditional situation, where sometimes you have the necessary information to initialize while you have exclusive access, and other times you need to do it later while shared.

@tisonkun
Copy link
Contributor

tisonkun commented Aug 8, 2023

OnceCell::get_mut_or_init would only be safe if you have &mut self. But if you have exclusive access when initializing, do you even need a cell in the first place? Couldn’t you use a plain Option and initialize it with get_or_insert_with?

Because not all accessors are get_mut_or_init. Basically, I need a get_or_init, but sometimes I want a mut reference and am unsure if it's initialized.

You can read this code snippet and see if there is a better solution than providing a get_mut_or_init:

// batches: OnceCell<Vec<RecordBatch>>,

    pub fn mut_batches(&mut self) -> IterMut<'_, RecordBatch> {
        self.batches.get_or_init(|| load_batches(&self.buf));
        // SAFETY - init above
        unsafe { self.batches.get_mut().unwrap_unchecked() }.iter_mut()
    }


    pub fn batches(&self) -> Iter<'_, RecordBatch> {
        self.batches.get_or_init(|| load_batches(&self.buf)).iter()
    }

https://github.com/tisonkun/kafka-api/blob/d080ab7e4b57c0ab0182e0b254333f400e616cd2/kafka-api/src/record.rs#L108-L116

@tisonkun
Copy link
Contributor

tisonkun commented Aug 8, 2023

But you're right that without concurrent calls an Option + get_or_insert_with may work.

@tisonkun
Copy link
Contributor

tisonkun commented Aug 14, 2023

No. I need to guard shared non-mut access to batches so I still need a get_mut_or_init.

Otherwise,

    pub fn batches(&self) -> Iter<'_, RecordBatch> {
        self.batches
            .get_or_insert_with(|| load_batches(&self.buf))
            .iter()
    }

failed to compile: Cannot borrow immutable local variable `self.batches` as mutable .

OnceCell impl !Sync and can support such interior mutability

tisonkun added a commit to tisonkun/rust that referenced this issue Aug 14, 2023
ryoqun added a commit to ryoqun/solana that referenced this issue Oct 4, 2023
cargo-test-sbf (governance/addin-mock/program, governance/program):
error[E0658]: use of unstable library feature 'once_cell'
   --> src/abi_example.rs:559:36
     |
 559 | impl<T: AbiExample> AbiExample for std::sync::OnceLock<T> {
     |                                    ^^^^^^^^^^^^^^^^^^^^^^
     |
     = note: see issue #74465 <rust-lang/rust#74465> for more information
     = help: add `#![feature(once_cell)]` to the crate attributes to enable
ryoqun added a commit to solana-labs/solana that referenced this issue Oct 4, 2023
* Define generic AbiExample for OnceLock

* Guard with cfg...

cargo-test-sbf (governance/addin-mock/program, governance/program):
error[E0658]: use of unstable library feature 'once_cell'
   --> src/abi_example.rs:559:36
     |
 559 | impl<T: AbiExample> AbiExample for std::sync::OnceLock<T> {
     |                                    ^^^^^^^^^^^^^^^^^^^^^^
     |
     = note: see issue #74465 <rust-lang/rust#74465> for more information
     = help: add `#![feature(once_cell)]` to the crate attributes to enable
tao-stones pushed a commit to tao-stones/solana that referenced this issue Oct 6, 2023
* Define generic AbiExample for OnceLock

* Guard with cfg...

cargo-test-sbf (governance/addin-mock/program, governance/program):
error[E0658]: use of unstable library feature 'once_cell'
   --> src/abi_example.rs:559:36
     |
 559 | impl<T: AbiExample> AbiExample for std::sync::OnceLock<T> {
     |                                    ^^^^^^^^^^^^^^^^^^^^^^
     |
     = note: see issue #74465 <rust-lang/rust#74465> for more information
     = help: add `#![feature(once_cell)]` to the crate attributes to enable
r-bk added a commit to r-bk/rsdns that referenced this issue Dec 22, 2023
Due to an indirect dependency which fails with the following error on
1.65.0

---
error[E0658]: use of unstable library feature 'once_cell'
   --> /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/ignore-0.4.21/src/gitignore.rs:596:9
    |
596 |     use std::sync::OnceLock;
    |         ^^^^^^^^^^^^^^^^^^^
    |
    = note: see issue #74465 <rust-lang/rust#74465> for more information
tisonkun added a commit to tisonkun/rust that referenced this issue Feb 14, 2024
tisonkun added a commit to tisonkun/rust that referenced this issue Feb 14, 2024
tisonkun added a commit to tisonkun/rust that referenced this issue Mar 24, 2024
tisonkun added a commit to tisonkun/rust that referenced this issue Mar 27, 2024
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Apr 6, 2024
impl get_mut_or_init and get_mut_or_try_init for OnceCell and OnceLock

See also rust-lang#74465 (comment)

I'm trying to understand the process for such proposal. And I'll appreciate it if anyone can guide me the next step for consensus or adding tests.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Apr 6, 2024
impl get_mut_or_init and get_mut_or_try_init for OnceCell and OnceLock

See also rust-lang#74465 (comment)

I'm trying to understand the process for such proposal. And I'll appreciate it if anyone can guide me the next step for consensus or adding tests.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Apr 6, 2024
impl get_mut_or_init and get_mut_or_try_init for OnceCell and OnceLock

See also rust-lang#74465 (comment)

I'm trying to understand the process for such proposal. And I'll appreciate it if anyone can guide me the next step for consensus or adding tests.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Apr 6, 2024
impl get_mut_or_init and get_mut_or_try_init for OnceCell and OnceLock

See also rust-lang#74465 (comment)

I'm trying to understand the process for such proposal. And I'll appreciate it if anyone can guide me the next step for consensus or adding tests.
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Apr 6, 2024
Rollup merge of rust-lang#114788 - tisonkun:get_mut_or_init, r=dtolnay

impl get_mut_or_init and get_mut_or_try_init for OnceCell and OnceLock

See also rust-lang#74465 (comment)

I'm trying to understand the process for such proposal. And I'll appreciate it if anyone can guide me the next step for consensus or adding tests.
@sdroege
Copy link
Contributor

sdroege commented Oct 21, 2024

Are there any plans for making OnceLock::get_unchecked() (and I guess the mutable version) public? That's currently the only thing preventing me from moving away from the once_cell crate.

@ChayimFriedman2
Copy link
Contributor

@sdroege You can do `OnceLock::get().unwrap_unchecked().

@sdroege
Copy link
Contributor

sdroege commented Oct 21, 2024

Thanks, the atomic op and branch in get() is indeed optimized away when doing that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-concurrency Area: Concurrency B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests