-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improving Entry API to get the keys back when they are unused #690
Comments
I still agree with this design, but there's some concern that it is optimizing for the general case, and not the common case. |
Another common scenario would be to insert an entry if absent and return an immutable reference to the value. Currently Of course, it is easy enough to convert a mutable reference to an immutable one: fn get_cache_entry(&self, key: &CacheKey) -> &CacheValue {
match self.cache.entry(key) {
Entry::Occupied(e) => &*e.into_mut(),
Entry::Vacant(e) => {
let cache_value = CacheValue { ... };
&*e.insert(cache_value)
}
}
} But I'd argue that this is too common a use case to ignore. |
So I've thought about this more. I can't think of any example where you'd "want the key back" that wouldn't be equivalent to "I want to clone on insert". I suppose clone-on-insert incurs one extra clone (the last one is potentially unnecessary), but it's unlikely that you'd be able to do avoid this in practice. Also clone-on-insert will likely be more memory effecient, as exactly the desired capacity for the elements in the Vec/String will be used, as opposed to just pushing in the buffer as-is. That brings us back to the IntoOwned/IntoCow design that we tried to land awhile back but had to revert because the by-value case is too important. As such I'm reasonably confident that it will be satisfactory when we have the design where you'll be able to lookup with either a reference or a value, and if you pass in a reference it will be cloned on insertion. @matthieu-m does that sound good to you? CC @aturon |
The IntoOwned would indeed work for most cases I think. It would perform exactly 1 more memory allocation IF the buffer is consumed in the last iteration while having the same number of allocations in all other cases. Furthermore, as you mention, it would allow tailoring the allocation of the keys to exactly what is needed and have a single (largish but short-lived) buffer. So I agree with you that IntoOwned sounds like a preferable option if we can make it work. |
@gankro actually, I am wondering if IntoOwned is going to be that easy. For efficiency, the slice should only become a String (with IntoOwned) if insertion is necessary. At the extreme, it means Vacant should contain a slice, though we might argue that if you use the Entry API then you wanted to insert if it is missing (otherwise you would just use a search) and therefore it would be okay for Vacant to contain the String already. In any case, though, if the Entry API returns a Occupied then no String should have been allocated. Do we agree up until there? Now, if we agree on this, how do we guarantee the coherence of the hashing operation? The simplest thing would be to transform the slice to a String with IntoOwned and hash the String, but then you might as well let the caller perform the transformation (as today). Another alternative would be to actually do the reverse operation, transform the String into a slice before hashing (Deref?), however the benefit of IntoOwned is that any type that implements IntoOwned can be used efficiently, so we cannot so constrain the HashMap. In the end, it seems that with the IntoOwned we will have to rely on the caller guaranteeing that both slice and the String it transforms into produce the same hash value (and can be compared with ==, but we can ask for a Eq bound here). We could double-check it by hashing the String once we have it built and panic in case of issue, but again that's slightly wasteful (maybe only in Debug) and of course it comes "too late". On the other hand, the idea of handing the key back might not be as general (forces the user to handle the buffer on her own), but at least it does not have hash coherence issues. |
IntoOwned implies BorrowFrom, which we already pervasively assume implies Hash/Eq/Ord are equivalent for. |
is there any news on this ? Another use-case that I stumbled upon is implementing an |
Is this, or a different solution still on the roadmap? |
(This should be moved to ACP, if still valid?)
If you want to get back an owned key originally from the collection there is now There is no method to get back the owned key provided to the For HashMap, if you need to retain the key's ownership the current solution is to use match map.raw_entry_mut().from_key(&key) {
RawEntryMut::Occupied(o) => {
// we have owned access to the provided `key` here. do whatever you like.
drop(key);
..
},
RawEntryMut::Vacant(v) => ..,
}; (The
We now have |
Issue by matthieu-m
Saturday Jan 17, 2015 at 15:09 GMT
For earlier discussion, see rust-lang/rust#21298
This issue was labelled with: A-collections in the Rust repository
First of all, I would like to congratulate Gankro on the Entry API, it really simplified the interface of associative containers.
If I had one nit, however, it is that the
entry
methods available always consume the key.Now, said key is taken by value so of course it is consumed, and the key is actually inserted in the container via the call to
VacantEntry::insert
, so it seems reasonable. However, in the case the key slot in the container is already occupied, and we get anOccupiedEntry
as a result, there seems to be little reason not to return the key to the user.For completeness, one could actually also improve
VacantEntry
so that is possible for it to hand over the key it had consumed, although why would one use the Entry API for look-up without any wish to insert is for now a mystery to me.The main benefit of returning the un-consumed key is allowing reusing it. When the key is a simple integer, this has little appeal, however when the key owns heap-allocated memory (such as
String
) then the ability to re-use the buffer rather than continuously allocating/deallocating seems like a huge boon.I prototyped the idea of an improved
Occupied
below, the modification is simple, and casual perusal of http://doc.rust-lang.org/src/std/collections/hash/map.rs.html#1150-1195 seems to show that simply modifying the lines 1175-1177 to:would be the only change to
HashMap
implementation; likewise the entry change at lines 1366-1371 is simple enough:I suspect, but have not verified, that other associative containers would likely be as easily modified. And of course, it would impact all consumers...
The example below show cases achieving buffer re-use through such an improved API, note how easy it is.
The text was updated successfully, but these errors were encountered: