
Consider reverting the merge of collections into alloc #43112

Closed
brson opened this issue Jul 7, 2017 · 74 comments
Labels
C-bug (Category: This is a bug.) · final-comment-period (In the final comment period and will be merged soon unless new substantive objections are raised.) · P-high (High priority) · T-libs-api (Relevant to the library API team, which will review and decide on the PR/issue.)

@brson (Contributor) commented Jul 7, 2017

A recent PR merged the collections crate into the alloc crate, with the intent of enabling a follow-up PR.

Here are some reasons against the merge:

  • It is a violation of layering / separation of responsibilities. There is no conceptual reason for collections and allocation to be in the same crate. It seems to have been done to work around a language limitation, to enable a fairly marginal feature. The tradeoff does not seem worth it to me.

  • It forces any no_std projects that want allocation to also take collections with it. There are presumably use cases for wanting allocator support without the Rust collections design (we know the collections are insufficient for some use cases).

  • It gives the std collections crate special capabilities that other collections may not implement themselves - no other collections will be able to achieve feature parity with the conversion this merger is meant to enable.

Personally I value the decomposition of the standard library into individual reusable components and think the merge is moving in the wrong direction.

I am curious to know what embedded and OS developers think of this merge. cc @japaric @jackpot51.

@brson added this to the 1.20 milestone Jul 7, 2017
@brson added the P-high and T-libs-api labels Jul 7, 2017
@alexcrichton (Member)

@brson I'm not sure I agree with the reasons you laid out. In my mind liballoc provides the ability to depend on dynamic memory allocation, but nothing else. On top of that one assumption you can build up the entire crate (pointers + collections) with no more assumptions than libcore already has. In that sense I don't see this as a violation of layering but rather just giving you one layer and then everything else that it implies.

It forces any no_std projects that want allocation to also take collections with it.

Is this a problem? Linkers gc'ing sections should take care of this? Isn't this equivalent to libcore bringing utf-8 formatting when you otherwise don't need it?

It gives the std collections crate special capabilities that other collections may not implement themselves

Do you have an example of this?

@japaric (Member) commented Jul 7, 2017

I'm in favor of undoing the merge.

In general, I'd prefer not to reduce the number of crates in the std facade. I'm afraid that these seemingly "small" changes may eventually become hard, if not impossible, to undo once stable std API depends on these crates being merged (cf. HashMap confined to std even though it doesn't depend on OS mechanisms). And that may lock us out of creating a set of stable, fine-grained standard / blessed crates for no-std development.

In the particular case of alloc / collections it feels odd to have a full suite of collections available to me when I'm implementing the Allocator trait. Things are likely to go wrong if I use those collections in the implementation of my allocator, right? I think allocators are not supposed to do any dynamic allocation for their internal data structures.

Personally, I'd prefer if alloc only contained the Allocator interface and Arc was in another crate since it depends on dynamic allocation.

cc @Ericson2314 @whitequark
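For concreteness, a minimal sketch of that point, written against today's stable GlobalAlloc where the thread's then-unstable Alloc trait would sit: a bump allocator whose entire internal state is a fixed buffer plus an atomic cursor, with no collections anywhere in its implementation.

use std::alloc::{GlobalAlloc, Layout};
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicUsize, Ordering};

// Bump allocator over a fixed buffer: its state is just an offset,
// so it never has to allocate in order to serve an allocation.
struct BumpAlloc<const N: usize> {
    heap: UnsafeCell<[u8; N]>,
    next: AtomicUsize, // offset of the first free byte
}

// `alloc` only hands out disjoint regions of `heap`, so sharing is sound.
unsafe impl<const N: usize> Sync for BumpAlloc<N> {}

unsafe impl<const N: usize> GlobalAlloc for BumpAlloc<N> {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let mut offset = self.next.load(Ordering::Relaxed);
        loop {
            // round up to the requested alignment (align is a power of two)
            let aligned = (offset + layout.align() - 1) & !(layout.align() - 1);
            let end = aligned + layout.size();
            if end > N {
                return core::ptr::null_mut(); // out of memory: report, don't panic
            }
            match self.next.compare_exchange_weak(offset, end, Ordering::Relaxed, Ordering::Relaxed) {
                Ok(_) => return (self.heap.get() as *mut u8).add(aligned),
                Err(current) => offset = current, // lost a race; retry
            }
        }
    }

    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
        // a bump allocator never reclaims individual allocations
    }
}

fn main() {
    static A: BumpAlloc<1024> = BumpAlloc {
        heap: UnsafeCell::new([0; 1024]),
        next: AtomicUsize::new(0),
    };
    let layout = Layout::from_size_align(16, 8).unwrap();
    assert!(!unsafe { A.alloc(layout) }.is_null());
}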

@alexcrichton (Member)

If we're going to separate this crate then I think we need to be super clear about why. For example, with the previous incarnation of alloc I don't think it made any sense to keep collections separate, because alloc already had smart pointers and such. If we wanted to remove everything then we should just move the Alloc trait to core, in my opinion, not make a whole new crate for just a trait. The purpose of alloc is to provide the assumption of a global allocator, notably the Heap type and value.

Things are likely to go wrong if I use those collections in the implementation of my allocator, right?

Yes, and to me this is not a reason to separate the crates. This is already a "problem" in the standard library where the "fix" is otherwise too unergonomic to work with.

@est31 (Member) commented Jul 9, 2017

It gives the std collections crate special capabilities that other collections may not implement themselves - no other collections will be able to achieve feature parity with the conversion this merger is meant to enable.

If cargo had the ability to build with multiple different forks of std (see this RFC), this would be much less of an issue.

@brson (Contributor, Author) commented Jul 11, 2017

Is this a problem? Linkers gc'ing sections should take care of this? Isn't this equivalent to libcore bringing utf-8 formatting when you otherwise don't need it?

It is a problem. I do not think depending on the linker to throw away unwanted code is a reasonable excuse for making unsatisfactory architectural decisions. If this were an out-of-tree project, where people have to spend CPU time compiling code, telling people to just merge your abstraction layers, compile it all, and throw most of it away later would not be acceptable.

Do you have an example of this?

Yes. The entire motivation for this merge is to follow it up with this PR, which is seemingly not possible without the collections being literally in the alloc crate. No other collections crate can achieve this.

For example, with the previous incarnation of alloc I don't think it made any sense to keep collections separate, because alloc already had smart pointers and such.

This was indeed a weakness of the previous alloc crate. I'd love to pull the smart pointers out if possible, and would love to not make the conflation worse.

If we wanted to remove everything then we should just move the Alloc trait to core, in my opinion, not make a whole new crate for just a trait

I don't think this follows. Each layer in the facade adds new capabilities. Alloc is a crucial capability that separates different classes of Rust software.

@RalfJung (Member)

I'd love to pull the smart pointers out if possible, and would love to not make the conflation worse.

So merging alloc and collections happened in order to be able to write impl<T> From<Vec<T>> for Arc<[T]>. If instead Arc (and Rc) moved to collections, the impl would also be fine -- right?
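For context, a sketch of the coherence rule in play, using the conversion that eventually shipped in std and a hypothetical local MyVec standing in for a third-party collection: the impl is only writable in a crate that defines at least one of the types involved.

use std::sync::Arc;

// A local wrapper standing in for a third-party collection type.
struct MyVec<T>(Vec<T>);

// Legal here because `MyVec` is local to this crate; an impl mentioning
// only foreign types would be rejected by the orphan rule, which is why
// std's own From<Vec<T>> for Arc<[T]> must live in a crate that can name
// both types.
impl<T> From<MyVec<T>> for Arc<[T]> {
    fn from(v: MyVec<T>) -> Arc<[T]> {
        v.0.into() // delegates to std's From<Vec<T>> for Arc<[T]>
    }
}

fn main() {
    let shared: Arc<[u8]> = MyVec(vec![1, 2, 3]).into();
    assert_eq!(&shared[..], &[1, 2, 3][..]);
}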

@sfackler (Member)

If the goal is to minimize library code bloat, should we move to libbtree_map, libbtree_set, libvec, libvec_deque, libbinary_heap, etc?

@Ericson2314 (Contributor)

@brson thanks for reopening discussion, and @japaric thanks for the ping.

Personally I value the decomposition of the standard library into individual reusable components and think the merge is moving in the wrong direction.

Exactly. I have a few reasons why I think combining facade crates relating to allocation is bad, but I want to start positively with why I think breaking facade crates further apart is good in general. Writing this with @joshlf's help.

Small, quickly compiled binaries are nice, but more important is the efficiency of the development process. We want a large, diverse, and flexible no-std ecosystem, because ultimately we are targeting a large diverse space of projects. Small libraries are key to this both because they reduce the cognitive load of everyone developing those libraries, and allow more developers to participate.

For cognitive load, it's important both to be utterly ignorant of the code that doesn't matter---as in, don't know what you don't know---and to incrementally learn the code that does. --gc-sections or super lazy incremental compilation strip away the noise after the fact, but do nothing to help the developer reading the code for the same task. Small libraries help ensure as little dead code as possible, allowing the developer and tools alike to stay happily ignorant of everything else out there. For incremental learning, it's crucial to be able to digest the code one small bite at a time. Rust libraries allow all sorts of intra-crate (inter-module) cycles, meaning that given a big unknown library, it can be really hard to know where to begin as a first-time reader. But fine-grained dependency graphs on crates enforce acyclicity---both a top-down learning approach (start with the nice abstractions) and a bottom-up one (complete understanding every step of the way) have clear curricula.

For distributed development and diverse goals, the benefits are probably more obvious. Lighter coupling means less coordination so more people can scratch their own itches in isolation. But it also allows complexity to be managed by splitting the ecosystem into small, modular components - this allows people to take only what they need, and thus only opt-in to as much complexity as their application requires.

All this begs the question---how far do we go down the road of splitting coarse crates into fine crates? I think quite far. It is my hope that as crates behind the facade and the language mature, more crates will be able to be written in stable (or otherwise trustworthy) code, and be moved outside rust-lang/rust into their own repo. Likewise, std should be able to (immediately and transitively) depend on third party crates just fine---the lockfile keeps this safe. Rustbuild, and my RFC #1133 are our friends here. To put it another way, there should be as little magic in rust-lang/rust as possible because magic cannot be replicated outside of rust-lang/rust. By splitting crates, we decrease the risk that a given piece of near-stable code will be bundled with code that cannot be stabilized, thus preventing the former from ever becoming stabilized and thus "de-magicked" in the sense of becoming eligible for being moved outside of rust-lang/rust. This runs counter to the arguments in favor of the collections/alloc merge.


In concrete terms, I recall there being talk of incorporating lazy_static into std, and I know there was talk of using parking_lot for locks too. lazy_static optionally uses spin and so should parking_lot, but spin probably shouldn't be exposed in std because it's a bad idea in hosted environments outside of bootstrapping such core abstractions. Yet it would also be weird if spin were included in the main repo as a dependency but not exposed: I generally like the idea of shrinking, not enlarging, the main repo, and would feel especially weird about including non-reexported functionality in it.

Back to allocation in particular, @joshlf's been doing some great work with object allocation (type-parametric allocators that only allocate objects of a particular type and cache initialized objects for performance), and it would be nice to use the default collections with that: the fact is, most collections only need to allocate a few types of objects, and so could work with object allocators just fine. Now if we were to combine alloc and collections and the object allocator traits in one crate, that would be a very unwieldy crate playing not just double but triple duty.


if we wanted to remove everything then we should just move the Alloc trait to core in my opinion, not make a whole new crate for just a trait.

Besides the technical reasons, as @japaric and I mentioned elsewhere, including anything allocation related in core, even something as harmless as a trait that need not be used, will scare away developers of real-time systems.

OTOH, I like the idea of allocator types being usable without Heap. Heap represents "ambient authority" for allocation, to use the capabilities lingo. As a general principle, ambient authority is suspicious, and in particular this complicates safely booting an OS, where one doesn't want to accidentally allocate before any allocator is set up.

Also, there's an analogous problematic cycle to avoid when defining the static allocator even if it is initialized statically: For crates which implement global allocators it's incorrect for them to use Heap, yet so long as it's in the same crate as the alloc traits, there's nothing to prevent them from doing so. Technically the language still wouldn't stop them if it's in another crate, but at least they would have to import something suspicious explicitly. This is just like @japaric's concern with allocators accidentally using collections.

CC @eternaleye

@joshlf (Contributor) commented Jul 11, 2017

Given @Ericson2314's comment, we (again, jointly) would like to make a proposal for how all of these various components might be structured.

We have the following goals:

  • The Alloc trait should have no dependencies other than core; in particular, it should not depend on other items in its own crate such as the collections or the Heap.
  • collections should have no dependencies other than core and the Alloc trait (e.g., it should not depend on the Heap); it should be possible for a no-std program to define its own allocator and use it to back the standard collections.
  • On the other hand, it is desirable for ergonomics to provide variants of each of the collections that default to using the Heap.

Thus, we propose the following:

  • The Alloc trait should either be in core or in its own crate (we'll call it alloc here). The former is simpler, but may be off-putting to embedded/real-time developers, and goes against the original philosophy of core - the bare minimum of nice abstractions to define the always-needed parts of the runtime, not every self-contained abstraction under the sun. The latter involves making a tiny crate whose sole purpose is to hold Alloc and related types (although it would become less tiny if an ObjectAlloc trait were added later). We lean towards a separate crate, but either works.
  • collections should be its own crate with dependencies only on core and alloc (if it exists). It should be no-std.
  • Heap can either be defined in its own crate with dependencies only on core and alloc (if it exists) or simply be defined in std.
  • std will re-export all of the collections with a default Alloc parameter of Heap. There may be other default values that std needs to set as well (e.g., the OS-dependent randomness provider for HashMap).
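A self-contained sketch of the last bullet, with every name a hypothetical stand-in and placeholder internals; this assumes defaults on type aliases behave as hoped, which @alexcrichton disputes below for the compiler of the time.

use core::marker::PhantomData;

pub trait Alloc {}
pub struct Heap;
impl Alloc for Heap {}

pub mod collections {
    use super::Alloc;
    use core::marker::PhantomData;

    // an allocator-parametric collection, internals elided
    pub struct Vec<T, A: Alloc> {
        _marker: PhantomData<(T, A)>,
    }

    impl<T, A: Alloc> Vec<T, A> {
        pub fn new_in(_alloc: A) -> Self {
            Vec { _marker: PhantomData }
        }
    }
}

pub mod std_facade {
    // std-side re-export: the same type, with the allocator defaulting to Heap
    pub type Vec<T, A = super::Heap> = super::collections::Vec<T, A>;
}

fn main() {
    // downstream users never have to name the allocator
    let _v: std_facade::Vec<u8> = collections::Vec::new_in(Heap);
}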

@joshlf (Contributor) commented Jul 11, 2017

@sfackler

If the goal is to minimize library code bloat, should we move to libbtree_map, libbtree_set, libvec, libvec_deque, libbinary_heap, etc?

I don't think that the arguments provided suggest that that should happen. Keeping all of the collections together doesn't increase the difficulty of understanding how things work because, for the most part, collections do not depend on one another, and do not share magic under the hood. From a usability perspective, they're logically related, so it would not be surprising to a developer to find them together in the same crate. The arguments we and others have presented do suggest splitting collections into its own thing - separate from, e.g., Alloc - but do not suggest further splitting collections into multiple crates.

@sfackler (Member)

All this begs the question---how far do we go down the road of splitting coarse crates into fine crates? I think quite far.

@joshlf ^ seems to imply to me that it would suggest that level of granularity.

@joshlf (Contributor) commented Jul 11, 2017

Ah, we definitely didn't mean to imply that. @Ericson2314 can weigh in when he gets a chance, but speaking for myself, I would interpret that as "quite far within reason." I don't think that our reasoning provides a good argument for splitting collections up into separate crates, although maybe @Ericson2314 will disagree and will think we should even go that far.

@Ericson2314 (Contributor)

Well, I wouldn't fight that level of granularity if many others want it :).

Trying to think of a principle for why it's more important to separate alloc from collections than each collection from the others, I arrived at a sort of tick-tock model where one crate (tick) adds some new capability, and the next (tock) builds a bunch of things with the capabilities added so far (it's "marginally pure"). Crates like alloc or kernel bindings (e.g. libc without the other junk) would be ticks, while collections and the old libsync would be tocks. I think it's more important that the ticks be minimal than the tocks.

@murarth (Contributor) commented Jul 11, 2017

The entire motivation for this merge is to follow it up with this PR, which is seemingly not possible without the collections being literally in the alloc crate. No other collections crate can achieve this.

@brson: Just a minor correction here: The referenced PR doesn't require collections to be within the alloc crate. It only requires Vec to be visible from the alloc crate, in order to implement From<Vec<T>> for Rc<[T]> and Arc<[T]>. Prior to the merge, this was not possible as it would have meant a cyclical dependency.

An out-of-tree collections crate would be able to make the same impls, as it could have both alloc and collections as dependencies.

@joshlf (Contributor) commented Jul 11, 2017

@murarth

Just a minor correction here: The referenced PR doesn't require collections to be within the alloc crate. It only requires Vec to be visible from the alloc crate, in order to implement From<Vec<T>> for Rc<[T]> and Arc<[T]>. Prior to the merge, this was not possible as it would have meant a cyclical dependency.

Presumably the visibility requirement exists because Rc and Arc are defined in alloc? If that's the case, would moving them either to their own crate or into collections solve the problem?

@murarth (Contributor) commented Jul 11, 2017

@joshlf: Yes, that's correct.

@joshlf (Contributor) commented Jul 11, 2017

In that case, that'd be my suggestion - to either make a separate box crate (or similar name) or to move them into collections.

@alexcrichton (Member)

@brson

I do not think depending on the linker to throw away unwanted code is a reasonable excuse for making unsatisfactory architectural decisions. If this were an out-of-tree project, where people have to spend CPU time compiling code, telling people to just merge your abstraction layers, compile it all, and throw most of it away later would not be acceptable.

I don't think I entirely agree with this. The whole premise of this crate is that we're shipping it in binary form so the compilation time doesn't matter too much. We absolutely rely on --gc-sections for so many other features that I think the ship has long since sailed on treating it as an optional feature of the linker invocation.

I think it's also important to keep this all in perspective and concrete. On 2017-06-01, liballoc took 0.3s to compile in release mode and libcollections took 3s. Today (2017-07-11) the merged liballoc takes 3.2s to compile in release mode. This is practically nothing compared to crates in the ecosystem.

Yes. The entire motivation for this merge is to follow it up with this PR, which is seemingly not possible without the collections being literally in the alloc crate. No other collections crate can achieve this.

I think this is too quick an interpretation, though. As @murarth pointed out above we're not empowering std collections with extra abilities. Any collection outside std can have these impls.

I don't think this follows. Each layer in the facade adds new capabilities. Alloc is a crucial capability that separates different classes of Rust software.

I think my main point is that we should not automatically go back to what we were doing before. I believe the separation before didn't make sense, and I believe the current separation makes more sense. If there's a desire to separate the concept of allocation from the default collections, that's fine by me, but I don't think we should blindly attempt to preserve what existed previously, which wasn't really all that well thought out (the alloc crate looks basically exactly the same as when I first made it ever so long ago).


I'd personally find it quite useful if we stick to concrete suggestions here. The facade is full of subtle intricacies that make seemingly plausible proposals impossible to implement today and only marginally possible in the future.

One alternative is the title of this issue, "Consider reverting the merge of collections into alloc". I have previously stated why I disagree with this.

Another alternative by @joshlf above relies on defining collection types that don't know about Heap. This is not possible today because you can't add default type parameters after-the-fact. This may be possible one day but it is not possible today.

I furthermore disagree with the rationale for keeping Alloc out of libcore. There's no technical reason that I know of why it can't be in libcore, and I'd personally find it a perfect fit for libcore. I can imagine a good number of contexts that want libcore, not libstd, and still use dynamic allocation.

@joshlf (Contributor) commented Jul 11, 2017

@alexcrichton

Another alternative by @joshlf above relies on defining collection types that don't know about Heap. This is not possible today because you can't add default type parameters after-the-fact. This may be possible one day but it is not possible today.

I wasn't thinking that you'd add default type parameters after-the-fact, but rather re-export as a newtype. E.g., in collections:

pub struct Vec<T, A: Alloc> { ... }

And then in std:

use collections::vec;
use heap::Heap;

pub type Vec<T, A: Alloc = Heap> = vec::Vec<T, A>;

I furthermore disagree with the rationale for keeping Alloc out of libcore. There's no technical reason that I know of why it can't be in libcore, and I'd personally find it a perfect fit for libcore. I can imagine a good number of contexts that want libcore, not libstd, and still use dynamic allocation.

That's fine - as we mentioned, keeping Alloc out of core is merely a slight preference, and our proposal still works if Alloc is in core.

@Ericson2314 (Contributor) commented Jul 11, 2017

The whole premise of this crate is that we're shipping it in binary form so the compilation time doesn't matter too much.

no-std devs will be recompiling it.

3.2s to compile in release mode

How much longer does it take to build a final binary that depends only on alloc not collections? I suspect that will tell a different story?

As @murarth pointed out above

Huh? The issue is Vec, Arc, and Rc need to live in the same crate, but that crate need not contain the allocator traits. I'd say we do indeed have a problem, and while moving all 3 of those to collections is a good step, there's still a problem because anyone else writing their own vec-like thing runs into the same issue.

I think my main point is that we should not automatically go back to what we were doing before.

I'd personally find it quite useful if we stick to concrete suggestions here.

I think there is some consensus besides you that the first step could be making a smaller alloc than before: with no Rc or Arc, and maybe not Box either? Heap and the Alloc traits would stay in alloc, and then as a second step either the traits would move to core, or Heap would move to its own crate.

This is not possible today because you can't add default type parameters after-the-fact. This may be possible one day but it is not possible today.

@joshlf beat me to using an alias (or, if that fails, a newtype). CC @gankro because HashMap and the hasher are 100% analogous.

@Ericson2314 (Contributor) commented Jul 11, 2017

I can imagine a good number of contexts that want libcore, not libstd, and still use dynamic allocation.

So can I, but nobody was saying just put it in libstd! libcore <- liballoc <- {libcollections | liballoc-system} <- libstd: I can see any sub-graph (including libcore) being useful.

  • libcore alone: hard real-time
  • libcore, liballoc: soft real-time---carefully denying the ability to do dynamic allocation to time-critical sections.
  • libcore, liballoc, liballoc-global: Linux kernel module

@alexcrichton (Member)

@joshlf

I wasn't thinking that you'd add default type parameters after-the-fact, but rather re-export as a newtype. E.g., in collections:

Yes I understand, and I'm re-emphasizing that this does not work today. Whether it will ever work is still up in the air. As a result it's not a viable proposal at this time.


@Ericson2314

How much longer does it take to build a final binary that depends only on alloc not collections? I suspect that will tell a different story?

No, I highly doubt it. Everything in collections is generic which is pay-for-what-you-use.

Huh? The issue is Vec, Arc, and Rc need to live in the same crate, but that crate need not contain the allocator traits.

This is missing the point. @brson originally thought that by moving Vec to liballoc we're empowering the Vec type with a capability not available to any other collection. This is not correct; the only problem was that the Box type could not reference the Vec name, due to how we organized the crates. This point has nothing to do with the allocator traits.

there's still a problem because anyone else writing their own vec-like thing runs into the same issue.

Can you articulate precisely what you think this problem is?

I think there is some consensus besides you that the first step could be making a smaller alloc than before: with no Rc or Arc, and maybe not Box either?

I disagree with two things here. I don't think it makes sense to couple Heap and Alloc. The Alloc trait is a generic form of allocation, assuming nothing. The Heap type assumes there is a global allocator available and that's quite significant. The second part I disagree with is a "smaller alloc", which I think should be called core::heap. The core::heap module would not contain the Heap type.

@joshlf beat me to using an alias (or, if that fails, a newtype). CC @gankro because HashMap and the hasher are 100% analogous.

Again, this is not possible. I was the one that put HashMap in std because it wouldn't fit in collections. We have not solved this problem; this is not a viable solution today. Whether it's a possibility in some future compiler needs to be explicitly accounted for, as it means we cannot take action at this time.

@whitequark (Member) commented Jul 11, 2017

Let's consider this from another point of view. I see the current crate hierarchy as follows:

  • libcore: no allocation, no I/O
  • liballoc: infallible allocation, no I/O
  • libstd: infallible allocation, hosted I/O

As a no-std/embedded developer, I do not see any practical use in having what's currently in liballoc split into any number of crates. It is permeated by infallibility on every level, from Box to BTreeMap (and we don't even have HashMap on no-std in the first place...) If I have to replace parts of it, I'm going to have to start from Box, and go downhill from there. Otherwise, there's no reason to use only alloc but not collections; Vec is as fundamental as Box.

The savings in terms of code size do not exist because the majority of embedded software is compiled with LTO and at least opt-level 1 even for debugging; without LTO, libcore alone would blow through the entire storage budget, and without opt-level 1, the generated code is all of: too large, too slow, and too hard to read at that special moment you have to read assembly listings.

It seems straightforward and obviously good to put the Alloc trait into libcore, since there is no cost for doing so and then it can become a common interface for any allocators people implement.

@joshlf (Contributor) commented Jul 11, 2017

It seems straightforward and obviously good to put the Alloc trait into libcore, since there is no cost for doing so and then it can become a common interface for any allocators people implement.

If you, @whitequark, as an embedded developer, don't mind putting the trait in libcore, then I think your opinion overrides mine and @Ericson2314's since we aren't embedded devs :)

@whitequark (Member)

@joshlf Mostly it's that I don't really understand the argument for keeping it out. The argument for having separate libcore and liballoc goes: libcore's only dependency in terms of linking is libcompiler_builtins, and it introduces no global entities, whereas liballoc introduces linking dependencies on the __rust_allocate, etc., symbols, as well as such global entities as an OOM handler. This presents a clear practical argument for keeping them separate.

A trait that isn't used has no cost.
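To make the linking dependency concrete, here is the stable form that landed later (#[global_allocator], which replaced the __rust_allocate symbol mechanism discussed here): a binary satisfies liballoc's allocation symbols by registering an allocator. A minimal sketch:

use std::alloc::{GlobalAlloc, Layout, System};

// A trivial allocator that forwards to the system allocator. Registering
// it is what satisfies liballoc's link-time allocation symbols.
struct Passthrough;

unsafe impl GlobalAlloc for Passthrough {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        System.alloc(layout)
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        System.dealloc(ptr, layout)
    }
}

#[global_allocator]
static ALLOC: Passthrough = Passthrough;

fn main() {
    let v = vec![1, 2, 3]; // heap allocation routed through ALLOC
    assert_eq!(v.len(), 3);
}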

@Ericson2314 (Contributor) commented Jul 11, 2017

@alexcrichton

Everything in collections is generic which is pay-for-what-you-use.

Right; my apologies; I forgot about that.

Can you articulate precisely what you think this problem is?

I thought it was a coherence issue. If it's a name-reachability issue with Box, that still counts: the ability to be named in the implementation of Box is a special ability only available to types in the same crate as Box or an upstream crate.

We have not solved this problem, this is not a viable solution today.

Ah, there is some naming confusion here, because @joshlf used a type alias. A wrapper struct is what I consider a newtype, and that would work, right? Wrapping every inherent method or using a one-off trait is annoying, but works. See the sketch below.
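A self-contained sketch of that wrapper-struct shape, with every name a hypothetical stand-in and placeholder internals; the annoyance is visible in the hand-forwarded methods.

// Stand-ins for the allocator pieces under discussion.
pub trait Alloc {}
pub struct Heap;
impl Alloc for Heap {}

// Stand-in for an allocator-parametric collections::Vec.
pub struct RawVec<T, A: Alloc> {
    items: Vec<T>,
    _alloc: A,
}

impl<T, A: Alloc> RawVec<T, A> {
    pub fn new_in(alloc: A) -> Self {
        RawVec { items: Vec::new(), _alloc: alloc }
    }
    pub fn push(&mut self, value: T) {
        self.items.push(value)
    }
    pub fn len(&self) -> usize {
        self.items.len()
    }
}

// The std-side newtype: fixes A = Heap and forwards each inherent method.
pub struct HeapVec<T>(RawVec<T, Heap>);

impl<T> HeapVec<T> {
    pub fn new() -> Self {
        HeapVec(RawVec::new_in(Heap))
    }
    pub fn push(&mut self, value: T) {
        self.0.push(value)
    }
    pub fn len(&self) -> usize {
        self.0.len()
    }
}

fn main() {
    let mut v = HeapVec::new();
    v.push(1);
    assert_eq!(v.len(), 1);
}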

@tarcieri (Contributor)

This is perhaps getting a bit off topic, but what I really want is Box, String, and Vec added to the prelude if an #[allocator] is defined:

https://internals.rust-lang.org/t/sgx-target-and-no-operating-system-x86-x86-64-targets-in-general/5493/5?u=bascule

@Ericson2314 (Contributor)

@tarcieri the best solution to that is custom preludes.

@tarcieri (Contributor) commented Jul 18, 2017

@Ericson2314 custom preludes don't really solve the problem I'd like to solve, which is allowing crates to leverage Box, Vec, and String from no_std without explicit dependencies on nightly. Since custom preludes are scoped to individual crates, every crate that wants to consume Box, Vec, and/or String still needs to do #![feature(alloc)] (or, if this change goes through, #![feature(collections)] again).

The whole point of automatically adding them to a core/alloc prelude would be to remove this explicit dependency. Then use of Box, Vec, and/or String could just be gated on cargo features with no explicit ties to nightly.

@Ericson2314 (Contributor) commented Jul 18, 2017

@tarcieri I'm not sure what to tell you, sorry. Some insta-stable trick to expose items through abnormal means, based on whether anything implements allocator anywhere, is... unwise in my book. I'd say stabilize the crates quicker, but @whitequark brings up a good point that our current story around handling allocation failure in everything but the allocator trait itself is atrocious: unergonomic and unsafe. I'm loath to stabilize the "beneath the facade" interfaces until that is fixed.

@whitequark (Member)

unergonomic and unsafe

What? That's the exact opposite of reality. It is safe (because crashes are safe), and it's ergonomic, because explicitly handling allocation failures in typical server, desktop, or even hosted embedded software has a high cost/benefit ratio.

Consider this. With mandatory explicit representation of allocation failure, almost every function that returns or mutates a Vec, Box or String would have to return Result<T, X>, where X is an error type representing allocation failure. The caller then has two choices, both of them bad:

  • either call unwrap, or
  • use ? and return Result<T, Error>, which may itself fail because Error has a Box in it!
    If the crate defines its own error type, then that can contain OutOfMemory, but this still pessimizes standalone impls. We'll also need to add OutOfMemory to io::ErrorKind.

Also, let's say you have a public function that returns T. You want to optimize it by using a hashmap or a cache or something, but that means it'd now need to return Result<T>, and that breaks your public interface. So you use unwrap(), resulting in essentially the same thing as what we have today, but with ugly source code (don't we all say that unwrap is an antipattern?) and bloated machine code (because of the inlined condition).

The only remotely workable solution I see is extending the API of Box, Vec, etc, adding a fallible sibling to every function that may allocate. But I believe the libs policy is explicitly against that.

@whitequark (Member)

To add to this, the majority of small embedded devices whose memory is truly scarce cope in one of two ways:

  • they partition memory statically or early in boot, so that any fallible allocations, e.g. network packet buffers, happen in pools and can be easily handled, or
  • they acknowledge that recovering from an out-of-memory condition is a complicated affair--you have to have some logic for what to free, your memory may just be too fragmented, etc--and instead they reset, just like in the case of a panic resulting from a violated invariant, an assertion, a lockup, a cosmic ray hitting the CPU core, etc.

As such, I feel like the primary structure providing fallible allocations would be some sort of memory pool. This can be easily handled outside the alloc crate.

@Gankra (Contributor) commented Jul 18, 2017

As a former libs team member, I'm not opposed to adding try_push, try_reserve, etc to the stdlib at this point in time. Someone just needs to put in the legwork to design the API, which I think was partially blocked on landing allocator APIs -- to provide guidance on what allocation error types are like -- when this first came up.

I believe the gecko team broadly wants these functions, as there are some allocations (often user-controlled, like images) which are relatively easy and useful to make fallible.
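That is in fact the shape that eventually landed in std as Vec::try_reserve (stable years after this thread). A small sketch of a caller surfacing allocation failure as a Result:

use std::collections::TryReserveError;

// Fallible append: reserve first, reporting OOM as an error instead of
// aborting, then extend within the already-reserved capacity.
fn append_chunk(buf: &mut Vec<u8>, chunk: &[u8]) -> Result<(), TryReserveError> {
    buf.try_reserve(chunk.len())?;
    buf.extend_from_slice(chunk);
    Ok(())
}

fn main() {
    let mut buf = Vec::new();
    append_chunk(&mut buf, b"hello").expect("allocation failed");
    assert_eq!(&buf[..], &b"hello"[..]);
}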

@Ericson2314 (Contributor) commented Jul 18, 2017

@whitequark Sorry, I meant that handling allocation failure in the client is unergonomic/unsafe. Everyone agrees that what we do today is both safe and ergonomic, but it is not flexible in that regard.

@whitequark (Member)

As a former libs team member, I'm not opposed to adding try_push, try_reserve, etc to the stdlib at this point in time.

Then this sounds like the best solution to me, yeah.

@brson (Contributor, Author) commented Jul 19, 2017

This thread has gone in a lot of different directions, and I'm having a hard time following it now, but it is originally about how to respond to the out-of-process merge of the collections and alloc crates.

I'm afraid I framed this incorrectly by presenting a defense of the unmerged crates, but the onus here is on proponents of merged collections and alloc crates to justify that design, and to do it in a way that is fair to the rest of us. I don't believe that argument has been made successfully.

This was a major breaking architectural decision, done out of process. In cases like this our default stance should be, and usually is, to revert. This needs to be done before the change hits stable. We have about 6 weeks right now.

I suggest we take the following actions:

  • Immediately un-deprecate the collections crate to stem the downstream churn

  • Revert the merge of collections and alloc

  • Implement the motivating PR (#42565) some other way. It has been said that it can be done without the merge, so please do it.

  • Proponents of changing the std facade write an RFC

@tarcieri (Contributor) commented Jul 19, 2017

My gut feeling is there are some complex interrelationships between these types which are difficult to model given the previous alloc/collections split. I think that was the motivation for the merge in the first place.

As an example, at the error-chain meeting we discussed moving std::error::Error to liballoc. My gut feeling (and it seems at least @whitequark agrees) is that it belongs in libcore. However, because blanket impls for Box required knowing that &str: !Error (or so say the comments in error.rs), it's coupled to Box, which I guess was the motivation for moving it to std in the first place.

This means any crates that want to work with no_std can't use std::error::Error, which unfortunately is a big blocker for making much of the extant crate ecosystem available in no_std.

I guess my question is: is there still a path forward for unlocking std::error::Error in no_std if this merge is reverted? Could std::error::Error be moved to libcollections instead of liballoc? (which feels even weirder, but I'm not sure there are other options due to its entanglement with Box)

@Ericson2314 (Contributor)

@tarcieri with or without the revert, it can go in alloc. If Box is moved to collections, it can go in there. [Based on http://aturon.github.io/blog/2017/04/24/negative-chalk/, I now think we actually will be able to get negative impls (but not necessarily user-writable negative bounds). Then it can be in core. But I digress.]

@brson Ah, I wasn't actually sure what the process is for changing crates whose very existence is unstable. Thanks for clarifying.

@tarcieri (Contributor)

My apologies if that's all orthogonal to the revert. If that's the case, I have no objections.

@aturon (Member) commented Jul 19, 2017

A couple of process points:

  • I don't know that this unstable arrangement of crates needed the full RFC process, but it should've gotten buy-in from more stakeholders, clearly.

  • That said, my feeling is that this has landed on nightly, it's already caused churn, and it would be better to see if we can quickly reach a rough consensus on the direction we want to take things before changing direction. We should try to minimize pain and confusion in the ecosystem while we try to correct our process mistake.

From the libs team meeting, it sounded like @alexcrichton was quite confident that the original crate organization was buying us nothing, and that the problems people are interested in solving are better addressed in some other way. I think it'd be good for @alexcrichton and @brson to sync up, and then summarize their joint perspective on thread, before we make more changes here.

@shepmaster (Member)

There doesn't seem to have been any progress on this in the last 3 weeks, and PR #42565 is blocked on this resolving one way or the other. What steps do we need to take to unstick this?

"Watch me hit this beehive with a big ol' stick", he said, pressing Comment

@tarcieri (Contributor)

#42565 was the sort of thing I was alluding to earlier. Is there a path forward for that if the merger were to be reverted?

@RalfJung (Member)

#42565 was the sort of thing I was alluding to earlier. Is there a path forward for that if the merger were to be reverted?

Yes -- move just Arc and Rc to libcollections. As long as Arc, Rc and Vec are all in the same crate, the orphan impl check (or whatever we call it in Rust ;) should go through.

@aturon (Member) commented Aug 11, 2017

I've re-read this entire thread, and to me the key points are:

Clarity of layering. I think @whitequark put it best:

  • libcore: no allocation, no I/O
  • liballoc: infallible allocation, no I/O
  • libstd: infallible allocation, hosted I/O

though we are working toward supporting fallible allocation in liballoc as well. But the point is that there's a crystal clear rationale for this coarse-grained organization, in terms of requirements imposed.

Crates providing features that aren't used. It's true that the crates.io ecosystem in general skews toward small crates, but these core library crates are a very important exception to that. The thread has made clear that bloat isn't an issue (due to linkers), nor is compilation time (which is trivial here, due to generics).

Special treatment of core crates. Grouping type and/or trait definitions together can enable impls that are not possible externally. However, (1) the very same issues apply to any choice of breaking up crates and (2) the standard library already makes substantial and vital use of its ability to provide impls locally.

Separating collections from the global heap assumption. There is not currently a plausible path to do this, but with some language additions there may eventually be. But by the same token, this kind of question also seems amenable to a feature flag treatment.

Personally, I find the new organization has a much more clear rationale than the current one, and is simpler as well. Stakeholders from impacted communities (the no_std and embedded communities) also appear to be on board. I think we should stay the course.

@rfcbot fcp close

@rfcbot commented Aug 12, 2017

Team member @aturon has proposed to close this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot added the proposed-final-comment-period label Aug 12, 2017
@Ericson2314 (Contributor)

Bummer. Then I suppose the next fronts for pushing for modularity are:

@rfcbot added the final-comment-period label and removed the proposed-final-comment-period label Aug 22, 2017
@rfcbot commented Aug 22, 2017

🔔 This is now entering its final comment period, as per the review above. 🔔

@jackpot51 (Contributor) commented Aug 22, 2017

Please don't revert it. This change has been out in the wild for a long time, and many projects have updated to the new organization of alloc. The new organization has benefits as listed above (#40475), and the downsides seem smaller than the benefits, especially considering that those of us tracking nightly and using the allocation API have already adjusted to it.

@aturon (Member) commented Aug 22, 2017

I'm going to close this issue now that the full libs team has signed off. (@jackpot51, to be clear, that means: we will not revert.)

@aturon closed this as completed Aug 22, 2017