Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pre-RFC: Add a way to insert into a collection, then return a reference to the inserted item in one operation. #3343

Open
5 tasks
cyqsimon opened this issue Nov 3, 2022 · 8 comments

Comments

@cyqsimon
Copy link

cyqsimon commented Nov 3, 2022

I ran across this issue today when I'm attempting to write a function like this:

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Player {
    // fields omitted
}
struct PlayerList {
    // other fields omitted
    players: HashSet<Player>,
}
impl PlayerList {
    /// Create a new player and insert it into the player list.
    /// Then return a reference to this new player.
    pub fn new_player(
        &mut self,
        // other fields omitted
    ) -> &Player {
        let new = Player {};
        self.players.insert(new);
        todo!("return a reference to the new player")
    }
}

To my surprise, there is no easy way to do this with HashSet, or for that matter, with any collection. I ended up having to write this:

    pub fn new_player(&mut self) -> &Player {
        let new = Player {};
        self.players.insert(new.clone());
        self.players.get(&new).unwrap()
    }

This is obviously bad for a multitude of reasons:

  • Unnecessary Player::clone call
  • Unnecessary Hashset::get call
  • Extra unwrap
  • Generally bad ergonomics

So I asked on the Rust-lang discord guild here, and Yand.rs gave this solution:

use ::hashbrown::HashMap;

fn create_new(s: &mut HashMap<String, ()>) -> &str {
    let new = String::from("…");
    s.raw_entry_mut().from_key(&new).or_insert(new, ()).0
}

... but also had this to say:

Sadly it requires raw_entry_mut(), which is either unstable, or requires hashbrown. And also only manual maps have these, sets don't for some reason


My proposition

Add a family of associated functions that provide this capability to all collections:

impl<T> Vec<T> {
    pub fn insert_get(&mut self, index: usize, value: T) -> &T;
    pub fn push_get(&mut self, value: T) -> &T;
}
impl<T> VecDeque<T> {
    pub fn insert_get(&mut self, index: usize, value: T) -> &T;
    // might be unnecessary, see questions
    pub fn push_front_get(&mut self, value: T) -> &T;
    // might be unnecessary, see questions
    pub fn push_back_get(&mut self, value: T) -> &T;
}
impl<T> LinkedList<T> {
    // might be unnecessary, see questions
    pub fn push_front_get(&mut self, value: T) -> &T;
    // might be unnecessary, see questions
    pub fn push_back_get(&mut self, value: T) -> &T;
}
impl<K, V> HashMap<K, V> {
    pub fn insert_get(&mut self, key: K, value: V) -> (&V, Option<V>);
    // See https://github.com/rust-lang/rust/issues/82766#issuecomment-1301845589
    pub fn insert_vacant_get(&mut self, key: K, value: V) -> (&V, Option<V>);
}
impl<K, V> BTreeMap<K, V> {
    pub fn insert_get(&mut self, key: K, value: V) -> (&V, Option<V>);
    // See https://github.com/rust-lang/rust/issues/82766#issuecomment-1301845589
    pub fn insert_vacant_get(&mut self, key: K, value: V) -> (&V, Option<V>);
}
impl<T> HashSet<T> {
    pub fn replace_get(&mut self, value: T) -> (&T, Option<T>);
}
impl<T> BTreeSet<T> {
    pub fn replace_get(&mut self, value: T) -> (&T, Option<T>);
}
impl<T> BinaryHeap<T> {
    pub fn push_get(&mut self, value: T) -> &T;
}

Potential questions

  • The names are descriptive but not very elegant IMO. Are there better ones?
  • Should there be *_get_mut variants too?
  • VecDeque and LinkedList already have front and back methods. Do they make push_front_get and push_back_get redundant?
  • In the return type of maps and sets, should the order of tuple members be flipped? (i.e. (Option<V>, &V) instead)
  • For maps, should the return type contain a reference to the key as well? (i.e. (&K, &V, Option<V>) or ((&K, &V), Option<V>))
@programmerjake
Copy link
Member

for the types with mutable members (so not sets) imho the returned reference should be a mutable reference, the caller can easily convert to a shared reference if they desire but not the other way around.

@SOF3
Copy link

SOF3 commented Nov 4, 2022

What is here that cannot already be done with the safe entry API (although it's missing for sets)?

let mut set: HashMap<String, ()>;
set.entry(String::from("abc")).insert_entry(()).get()

@cyqsimon
Copy link
Author

cyqsimon commented Nov 4, 2022

for the types with mutable members (so not sets) imho the returned reference should be a mutable reference, the caller can easily convert to a shared reference if they desire but not the other way around.

I certainly see the "why not?" appeal here, and TBH I did consider this while writing the post. Ultimately I think it's a tradeoff between consistency and functionality. It might be surprising for some users to find that:

  • *_get (as opposed to *_get_mut) returns a mutable reference
  • the methods for *Set don't return mutable references when that's the case for all other collections

Ultimately I decided to go for the consistency and just return an immutable reference regardless. We can always add *_get_mut variants in the future if necessary, at least that's my thinking.

But yeah admittedly I am pretty torn on what to do here. I would like to know the various opinions of the broader community.

@cyqsimon
Copy link
Author

cyqsimon commented Nov 4, 2022

What is here that cannot already be done with the safe entry API (although it's missing for sets)?

I'm entirely in agreement here that in the specific case of *Map types (and in the future hopefully, *Set types as well; in fact if I'm not mistaken, it has already landed in hashbrown, but hasn't propagated to the standard library yet), these methods will offer little except for ergonomic improvements over the more general entry API. However for other collection types that don't have an entry API (or even the concept of it), they will be somewhat more tangibly useful IMO.

So again it comes down to a tradeoff with consistency I suppose. It would be nice if this family of methods is universal across all collections; but I also fully admit that this will come with the downside of duplication of functionality. Honestly, I don't know which path forward is better either. After all, that's why this is a pre-RFC, sorry ¯\_(ツ)_/¯

@tbu-
Copy link
Contributor

tbu- commented Nov 12, 2022

The proposed functions are unfortunately not all that practical because they cause the underlying collection to be borrowed mutably, meaning that one cannot call any function or get any other reference out of them.

let new = vec.push_get(123);
// this errors:
println!("{}", vec[1], new);

@cyqsimon
Copy link
Author

The proposed functions are unfortunately not all that practical because they cause the underlying collection to be borrowed mutably, meaning that one cannot call any function or get any other reference out of them.

Yikes. That's a very good point (unfortunately) that I overlooked. I just assumed that since we're only retaining an immutable reference, the borrow checker would see that the collection is no longer being borrowed mutably.

Speaking of which, I'm very curious whether there is actually a safety reason behind this behaviour, or is it just a technical limitation: why doesn't the borrow checker see that the retained reference has been "downgraded"? It seems to me like all the information the borrow checker would need to deduce this is available in the function signature.

@SOF3
Copy link

SOF3 commented Nov 16, 2022

Speaking of which, I'm very curious whether there is actually a safety reason behind this behaviour, or is it just a technical limitation: why doesn't the borrow checker see that the retained reference has been "downgraded"? It seems to me like all the information the borrow checker would need to deduce this is available in the function signature.

A simple counterexample:

fn demut(x: &mut AtomicU32) -> &u32 {
    &*x.get_mut()
}

let mut x = AtomicU32::new(1);
let y = demut(&mut x);
// y is now an immutable &u32 that equals &1
x.store(2, SeqCst);
// is *y now 1 or 2?

@golddranks
Copy link

@cyqsimon As @SOF3 's example shows, the borrow checker can't see that because that would be essentially an API guarantee, not an implementation detail. Besides, there's the "design rule" that the borrow checker can't depend on information from inside of function bodies, as those are considered implementation details too. There has been some ideas of having a "downgrading annotation" (e.g. https://internals.rust-lang.org/t/relaxing-the-borrow-checker-for-fn-mut-self-t/3256/21), but no concrete proposals, and I doubt the time is ripe for such a proposal at the moment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants