Trait objects for multiple traits #2035
Comments
Just dealt with a really frustrating workaround for the lack of upcasting; I'd be interested in working on this. A few questions/comments (in retrospect, maybe more than a few): It looks like the best way to start out with this is supertrait coercion, and then move on to arbitrary combined traits. The first looks more or less straightforward, and the second still has some unanswered tradeoff questions, but I do think it's good to keep the second in mind so that casting is implemented in a way that can eventually segue into multiple-trait pointers. For supertraits, it seems that we could for the most part just concatenate vtables together. However, consider this example:
trait A: Debug + Display { /* pretend there are methods here */ }
trait B: Debug + Display { /* same here */ }
trait C: A + B {}
What does the vtable for As an aside, there is a possible optimization around the case of Which brings up another question - going from The only really feasible option I can think of for this is to have the compiler trace which types have a potential path to any of those combinations, and only generate extra tables for those types. Not familiar at all with internals, so I don't know how much work that would be. For reference, here's how C++ does virtual inheritance. The problem is that virtual inheritance in C++ is not nearly as common as multiple trait implementation in Rust, so the overhead of writing all the extra tables and offsets isn't as much. On the other hand, most of the common std traits are just one or two methods, so I don't know if it would even help to do it the C++ way with offset tables, and C++ doesn't have a way to pass a pointer to two classes, you just have to use |
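For concreteness, here is a minimal sketch of the diamond-shaped hierarchy from the example above, with explicit conversion methods standing in for built-in supertrait coercion. The `Widget` type, the `a`/`b` methods, and the `as_a`/`as_b`/`describe` helpers are hypothetical names added for illustration; they are not part of the original example.

```rust
use std::fmt::{self, Debug, Display};

trait A: Debug + Display {
    fn a(&self) -> &'static str;
}

trait B: Debug + Display {
    fn b(&self) -> &'static str;
}

// Hypothetical workaround: explicit conversion methods play the role that
// built-in upcasting from `dyn C` to `dyn A` or `dyn B` would play.
trait C: A + B {
    fn as_a(&self) -> &dyn A;
    fn as_b(&self) -> &dyn B;
}

#[derive(Debug)]
struct Widget;

impl Display for Widget {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "widget")
    }
}

impl A for Widget {
    fn a(&self) -> &'static str {
        "a"
    }
}

impl B for Widget {
    fn b(&self) -> &'static str {
        "b"
    }
}

impl C for Widget {
    fn as_a(&self) -> &dyn A {
        self
    }
    fn as_b(&self) -> &dyn B {
        self
    }
}

// `Debug` and `Display` are reachable through both `A` and `B`, yet a
// `dyn C` object exposes each of them exactly once.
fn describe(c: &dyn C) -> String {
    format!("{} {:?} {} {}", c, c, c.as_a().a(), c.as_b().b())
}
```

Calling `describe(&Widget)` goes through `Display`, `Debug`, `A`, and `B` on a single `&dyn C`, which is the situation the vtable-layout question is about.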
Is there any reason you can't have one vtable for A + B + C and an entirely different vtable for A + C? The benefits of reusing the same vtable if possible are clear - but in cases where that's not possible, what's stopping the compiler from generating a new one for just A + C? |
I just ran my head into this, and it's left me pretty disappointed. I'm developing the rust_dct crate. Each of the discrete cosine transforms, types 1 through 4, has its own trait (DCT1, DCT2, DCT3, DCT4), and the same goes for the discrete sine transforms (DST1, DST2, DST3, DST4). Previously the structs that implemented these traits were completely disjoint: there's a struct that converts DCT3 problems into FFT problems, and an entirely separate struct that converts DCT2 problems into FFT problems. But recently, I discovered that DCT2, DST2, DCT3, and DST3 problems pretty much always require the same precomputed data and pre-allocated buffers, and so I've started creating single structs that can compute all four. So a single "convert to FFT" struct implements all four traits: DCT2, DST2, DCT3, and DST3. All D{C,S}{2,3} structs are now implemented this way. Sometimes, I have a problem that needs both a DCT2 and a DCT3. Currently, I have to write this:
If I had a problem that needed DCT2, DST2, DCT3, and DST3, I'd have to be even more verbose:
I absolutely despise this API though, because it requires an unreasonable amount of repetition by the user, and in the end all four
And then I can use the same
wouldn't it be ridiculously convenient if I could just pass the dct object along and let the compiler figure it out?
|
Mainly the fact that that requires a number of vtables which is exponential in the number of
I agree this should 'just work' one way or another. But as a workaround, for the record, rather than taking separate Alternately… do you actually need to be using trait objects in the first place? Just from a glance at your use case, my guess is that having |
It's definitely occurred to me that my situation works as expected if this is all done at compile time instead of with trait objects. To explain why trait objects are more or less necessary here, let me provide two more bits of context:
|
This is a good idea and I think a bunch of people would like to see this implemented. Does anyone want to have a crack at an RFC? (I don't feel too expert about it myself.) |
I was reading accepted RFC 1733 — trait aliases and came to the erroneous conclusion that it would supersede this RFC:
Only careful reading of the RFC's examples showed me this was not the case:
trait PrintableIterator = Iterator<Item=i32> + Display;
fn bar3(x: Box<PrintableIterator>) { ... } // ERROR: too many traits (*)
There's even an example immediately after that that makes it look like this would work, but only because one of the magic auto traits was renamed:
trait Sink = Sync;
trait IntIterator = Iterator<Item=i32>;
fn bar4(x: Box<IntIterator + Sink + 'static>) { ... } // ok (*)
Anyway, my point is that I think that RFC 1733 is going to exacerbate the occurrences of this issue. |
This is a complication which doesn't have to be solved immediately — the compiler can simply state that up-casting to multi-trait objects is not (currently) supported. As a workaround users can use What should be supported is:
Issue: name conflict where Issue: if Is this enough of an RFC? It doesn't detail what vtables should look like, but this is probably best left unspecified (I don't see any further complications). |
This behavior was split into two functions so that one of the 5 places we're calling it from could intercept a 404 response code. Since the happy path otherwise is just calling `.json`, this felt really awkward to me. Instead I've opted to return `NotFound`, and downcast to it in the error path for that one place.
I expected to just be able to call `Any::is`, but it turns out this method is an inherent method not a trait method (which makes sense, otherwise `Any` wouldn't be object safe). However, a side effect of that is that even though `Any` is a supertrait of `CargoError`, we can't call `Any::is` since `dyn CargoError` can't be cast to `dyn Any`. This may change at some point in the future (see rust-lang/rfcs#2035), but we have to duplicate the body of `Any::is` for now. Except we can't even just duplicate the body of `Any::is`, because the only trait method for `Any` is unstable so we have to duplicate that method, too..........
Other notable changes:
- We're no longer sending `Proxy-Connection: Keep-Alive`. According to Wikipedia, this is "Implemented as a misunderstanding of the HTTP specifications. Common because of mistakes in implementations of early HTTP versions". Firefox recently removed it, as have recent versions of curl.
- We will accept gzipped responses now. This is good.
- We send an actual user agent, instead of "hello!"
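The `Any::is` duplication described in that commit message can be sketched roughly as follows. This is an illustration under assumptions, not the code from that commit: `AppError` is a hypothetical stand-in for `CargoError`, and `concrete_type_id`, `Timeout`, and `is_not_found` are made-up names.

```rust
use std::any::{Any, TypeId};

// Hypothetical error trait with `Any` as a supertrait (stand-in for `CargoError`).
trait AppError: Any {
    // Stand-in for the `Any` trait method. It is dispatched through the
    // `dyn AppError` vtable, so it reports the concrete type behind the object.
    fn concrete_type_id(&self) -> TypeId {
        TypeId::of::<Self>()
    }
}

impl dyn AppError {
    // Duplicate of the inherent `Any::is`, written against `dyn AppError`
    // because the object is not converted to `dyn Any` here.
    fn is<T: AppError>(&self) -> bool {
        self.concrete_type_id() == TypeId::of::<T>()
    }
}

struct NotFound;
struct Timeout;

impl AppError for NotFound {}
impl AppError for Timeout {}

// The one call site that wants to intercept the "not found" case.
fn is_not_found(err: &(dyn AppError + 'static)) -> bool {
    err.is::<NotFound>()
}
```

The default method body runs for the concrete implementor, which is why comparing its result with `TypeId::of::<T>()` identifies the type hidden behind the trait object.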
Going back to @parkovski's example: can we not use a reduced vtable for C (basically just A plus unique methods from B), then use a lookup table to replace one vtable with another to support This implies that trait-object-casting functions may need a small dataset (of pointers to vtables) embedded in the executable and that trait-object-cast may be slow, but I don't think those are real problems? |
The hard question is what happens if you want to cast For more discussion of those trade-offs see https://internals.rust-lang.org/t/wheres-the-catch-with-box-read-write/6617 -- there is one approach in there (vorner's) that side-steps the aforementioned problems but instead sacrifices efficiency of some virtual calls. |
I opened a discussion at https://internals.rust-lang.org/t/casting-families-for-fast-dynamic-trait-object-casts-and-multi-trait-objects/12195 that should address the issues with layering compilation units. |
I figured I'd mention since I haven't seen any discussion of it here, that there is a minimal version of this that is probably much more trivial to add which is adding support just for user-defined marker traits. Because marker traits purely exist at the type level and shouldn't (AFAIK) require any vtables. I ran into wanting this due to trying to make an object safe wrapper trait for an existing non-object safe trait. Anywhere the original trait used |
@jgarvin Good comment! Makes me think that, rather than EDIT: Come to think of it, the "trait addition ok" is probably an |
@scottmcm I believe it is currently an auto trait behavior because the error message you get when you try |
It feels weird to me that this doesn't compile:
But this does:
Will this fix that? |
@SOF3 I'd assume for the same reason only closures allow type inference in function signatures. It'd be too easy to conflate interface and implementation. |
A trait could add new methods with default implementations without breaking semver compatibility. |
I scrolled quite a bit through the backlog, and still don't understand one point: Why can't we automatically desugar |
@piegamesde If that is all you need, you can just declare that trait, provide a blanket impl and use it. No need to introduce any extra language complexity. You can even write a proc macro to automate that pattern, and I would expect that someone already published a crate for that. |
I wrote up a detailed version of roughly that approach at https://internals.rust-lang.org/t/casting-families-for-fast-dynamic-trait-object-casts-and-multi-trait-objects/12195 and yes it requires little if any rustc support. Yet as @unbrice and I discussed above one should not expect single solutions that fit everything optimally. In this case, you wind up with different |
Vtables are already duplicated between codegen units anyway. |
Relying on proc macros for something that is a rather basic expectation for how the language should behave is an admission that the language has failed. I think the request for the desugaring makes a lot of sense. Sometimes all you need to know about an argument is that it implements a set of behaviors (traits). Limiting to a single trait is needlessly burdensome on users of the language and I don't think rust should openly dismiss the desires of users for something that seems rather reasonable. |
@ProofOfKeags It's not my basic expectation. In my view, the thing proposed above is a very niche desire. Neither is it a goal of the language to pull in every feature under the sun. One-off cases should be implemented in crates, via macros or whatever other way, and the core language must have only powerful composable features which work in every supported use case. |
Given that trait objects are natively supported by the language, and we have syntax for being able to combine trait bounds (via +), it is a natural expectation for a significant portion of rust users that this would work in this way. The fact that it doesn't is a language failure. Whether you care or not is a separate issue.
This is a composability failure, I'm not sure why you aren't registering it that way. Languages are intuitive when users can take the concepts they know and apply them elsewhere. Concept 1: I can make trait objects by doing Concept 2: I can intersect trait requirements by doing Expected Composition: Reality: This doesn't work. Given that the solution seems possible by just desugaring, it seems like the approach to fixing this could be rather non-invasive. Why isn't this something that should be considered? |
@ProofOfKeags It's quite easy to do the thing you want:
trait Composite: Trait1 + Trait2 {}
impl<T> Composite for T where T: Trait1 + Trait2 {}
// use Box<dyn Composite>
So what you're arguing for is not some language feature which opens real new possibilities, but rather syntactic sugar for writing the two lines above. Now, syntactic sugar certainly has its place, but it still has a high bar to clear in terms of language additions vs ergonomic benefits. As it stands, the benefits are pretty low: there are two simple lines to write, and there are not that many traits which you would want to compose in the first place. In terms of semantics and composability, the cost of the feature would be much higher (at least with the proposal "just declare automatically a new trait"). Where is that composite trait declared? Ok, let's say we figure it out. Let's say we impose some fixed layout of composite traits, so that the coercion can be a no-op. But now you have declared that And what about At this point, what did you get in clarity, composability and language regularity over just forbidding sum traits in trait objects? All of these issues at the very least mean that "just declare an anonymous composite trait" isn't a workable suggestion. Now, there may be some other approach, like making |
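A small runnable version of the composite-trait workaround, for reference. The traits `Shape` and `Named`, the types `Square` and `Rect`, and the `summarize` helper are hypothetical and chosen only for illustration.

```rust
trait Shape {
    fn area(&self) -> u32;
}

trait Named {
    fn name(&self) -> String;
}

// Composite trait plus blanket impl: every type that implements both
// component traits automatically implements `NamedShape`.
trait NamedShape: Shape + Named {}
impl<T: Shape + Named> NamedShape for T {}

struct Square(u32);
struct Rect(u32, u32);

impl Shape for Square {
    fn area(&self) -> u32 {
        self.0 * self.0
    }
}

impl Named for Square {
    fn name(&self) -> String {
        "square".to_string()
    }
}

impl Shape for Rect {
    fn area(&self) -> u32 {
        self.0 * self.1
    }
}

impl Named for Rect {
    fn name(&self) -> String {
        "rect".to_string()
    }
}

// Methods of both component traits are callable on the one trait object.
fn summarize(items: &[Box<dyn NamedShape>]) -> Vec<String> {
    items
        .iter()
        .map(|s| format!("{}: {}", s.name(), s.area()))
        .collect()
}
```

Note that two crates which each declare their own composite trait this way end up with two distinct trait object types, even when the component traits are the same.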
No, it wouldn't. For the same reason that |
@bjorn3 I have literally addressed this in my comment.
That's nothing to be proud of, it's a problem to fix. Doubling down on it to get a syntax-sugar feature is unacceptable. |
In about 90% of the cases where I'd need multiple trait trait objects I have no need for any up- or downcasting and the basic "anonymous union trait" solution would work for me. I can only speak for my own usage patterns here, but calling these use cases "niche" or "one-off" does not feel appropriate. The problem with the proposed two-line workaround is how it interfaces with different libraries. Say I have a library that requires trait objects of two traits to work. So it creates the said union type, but now it must also expose it as (for example) Go has common union types declared in the standard library, but I'd still prefer having these types implicit by using the already common syntax of |
How are you going to fix duplication of vtables for types like |
@afetisov already mentioned that:
I.e. fatter trait objects. I think this approach would be worth further investigation (especially because up-cast to super/component trait objects is easy). |
True, but you get the same problem if you try to do that trick at the compiler level, as I discussed above.
I don't know what duplicates you are talking about. You must have a specific vtable at hand while compiling a crate, so that you can put the pointers to that vtable in the trait object. So basically the only way to deduplicate those composite trait objects would be to punt the issue of pointer patching to the linker. This may mean baking something like LTO, or at least some uncomfortable coupling to the linker, into the language semantics; otherwise you can't guarantee ABI compatibility between the different sum trait objects. My personal opinion is that multiple vtable pointers are the way to go for implementing sum trait objects. This gives a simple, truly zero-cost implementation of casts to subsets of traits, is simple to implement and understand, and the only downside is increased fat pointer size. On its own, I don't consider it an issue. If a pointer gets too fat, you can always use double-indirection to deal with it, or use the usual subtrait trick to get a single trait object. However, a potential hazard is that some code may rely on the current non-guarantee that pointers to dynamically sized types are two pointers wide. Perhaps it's time to write a proper RFC for that design. |
I’m also in favor of fatter pointer types for sum trait objects. It seems like a very natural extension of the current trait design. Summarizing, the way I see it: Advantages: minimal compile-time cost (code size, code duplication, compilation time, compiler complexity, etc), trivial subset casting. Disadvantages: non minimal runtime cost (fatter pointers), but there are reasonable workarounds for the niche cases where too many traits are combined. Are there any other disadvantages? |
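The "one vtable pointer per trait" idea can be approximated in today's Rust with a library-level struct, sketched below under stated assumptions. `FooBarRef`, `Both`, and `pointer_words` are hypothetical names. This emulation stores the data pointer twice (two ordinary fat pointers, four words in total), whereas a built-in sum trait object would presumably need one data pointer plus one vtable pointer per trait.

```rust
use std::mem::size_of;

trait Foo {
    fn foo(&self) -> u32;
}

trait Bar {
    fn bar(&self) -> u32;
}

// Library-level emulation of a reference to `dyn Foo + Bar`:
// one ordinary fat pointer per component trait.
#[derive(Clone, Copy)]
struct FooBarRef<'a> {
    foo: &'a dyn Foo,
    bar: &'a dyn Bar,
}

impl<'a> FooBarRef<'a> {
    fn new<T: Foo + Bar>(value: &'a T) -> Self {
        FooBarRef { foo: value, bar: value }
    }

    // Casting to a subset of the traits is a plain field projection.
    fn as_foo(self) -> &'a dyn Foo {
        self.foo
    }

    fn as_bar(self) -> &'a dyn Bar {
        self.bar
    }
}

struct Both(u32);

impl Foo for Both {
    fn foo(&self) -> u32 {
        self.0
    }
}

impl Bar for Both {
    fn bar(&self) -> u32 {
        self.0 * 2
    }
}

// Size of the emulated pointer, measured in machine words.
fn pointer_words() -> usize {
    size_of::<FooBarRef<'static>>() / size_of::<usize>()
}
```

For example, `FooBarRef::new(&Both(7)).as_bar().bar()` dispatches through the `Bar` vtable without any lookup, which illustrates the trivial subset casting listed under the advantages.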
It needs a lot of changes to allow non-pointer sized pointer metadata across rustc. Especially in the codegen backends that assume every fat pointer is a ScalarPair. In addition it makes the pointers larger. Potentially a lot so. If neither was the case we would have used fatter pointers for upcasting |
With upcasting, the benefits of sumo pointers are less clear, because there is a fixed graph of supertraits, and we can optimize the layout of pointers and vtables based on that information. For example, in single-inheritance case the optimal solution is "supertraits are initial segments of subtraits" (optimizing for speed) or "supertraits contain pointers to subtraits" (optimizing for size). With sum traits, the biggest issue is that the set of summands is potentially "all traits in the artifact", and that itself is not defined until the root crate is compiled.
I assumed that the same flexibility would be required for arbitrary DST's, which are expected to be supported, even if currently super unstable. It is unfortunate if the "two pointers" assumption is still baked in. |
Yeah, custom DST's would also require it and as such rustc is far from ready for custom DST's if they were to be proposed and accepted. cg_clif is better in this regard than cg_llvm as cg_llvm matches on OperandValue (which has the Ref, Immediate and Pair variants) a lot to determine how to handle values, while cg_clif matches on the actual type and is completely fine with referencing a fat pointer by reference (as necessary for fat pointers not fitting in two pointer sized values) rather than forcing it to be put in the equivalent of |
That's fair, and certainly something to consider. But actually, I meant in usability, i.e. would sumo pointers lack some capability that people want (or not do something well enough)? |
I also hit this issue while using I am not familiar with rust (compiler) internals, but from the discussion above it appears to me that the implementation of trait inheritance is to blame here. Apparently, the sub-trait vtable includes the super-trait vtable(s). If instead the sub-trait vtable just contained a pointer to the vtable of each of its super-traits, all troubles would vanish. This is because, given The same works for |
This is pretty much what is done with |
I suspect that in most cases, the number of individual traits used for all of a program's dynamic dispatches is small enough that we could afford one big, flat offset-table struct with nullable entries for all of them. Then casting Box<dyn A+B+C> up to Box<dyn A+C> not only wouldn't require an extra offset table, but it'd be a no-op at runtime. The total size of all offset tables would grow in O(nTraits * nImpls) rather than exponentially in nTraits. Plus this would support runtime instance-of-trait checks at no extra cost (just check whether the trait's entry is null). |
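A toy model of the flat offset table may make the idea easier to discuss. It assumes that the complete set of traits is known up front so that each trait can be given a fixed slot index. Everything here (`VTable`, `OffsetTable`, `Object`, the `TRAIT_*` constants) is hypothetical and greatly simplified: each "vtable" has a single method over plain `u32` data so the sketch stays in safe code.

```rust
// Assumption: every trait used for dynamic dispatch has a fixed global slot.
const TRAIT_A: usize = 0;
const TRAIT_B: usize = 1;
const TRAIT_C: usize = 2;
const N_TRAITS: usize = 3;

// Stand-in for a per-trait vtable with a single method.
struct VTable {
    call: fn(u32) -> u32,
}

// One flat table per implementing type; `None` means "trait not implemented".
struct OffsetTable {
    slots: [Option<&'static VTable>; N_TRAITS],
}

fn double(x: u32) -> u32 {
    x * 2
}

fn add_one(x: u32) -> u32 {
    x + 1
}

static A_VTABLE: VTable = VTable { call: double };
static C_VTABLE: VTable = VTable { call: add_one };

// This hypothetical type implements A and C but not B.
static TABLE: OffsetTable = OffsetTable {
    slots: [Some(&A_VTABLE), None, Some(&C_VTABLE)],
};

// The "trait object": data plus a pointer to the type's flat table.
#[derive(Clone, Copy)]
struct Object {
    data: u32,
    table: &'static OffsetTable,
}

impl Object {
    // Runtime instance-of check: is the trait's slot populated?
    fn implements(&self, trait_index: usize) -> bool {
        self.table.slots[trait_index].is_some()
    }

    // Dispatch through the slot for the requested trait, if present.
    fn call(&self, trait_index: usize) -> Option<u32> {
        self.table.slots[trait_index].map(|vt| (vt.call)(self.data))
    }
}

fn example() -> Object {
    Object { data: 21, table: &TABLE }
}
```

In this model the same `Object` value serves any subset of traits, so narrowing from A + B + C to A + C needs no conversion at all. The fixed `N_TRAITS` is exactly the part that becomes hard once traits are not all known when the tables are generated.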
Vtables are codegened before all traits are known. There may even be new traits added at runtime using dlopen. |
Then the trait-vtable-offset tables (not the individual trait vtables, we don't have to change those) can be stored as a Vec-like type and extended when a new trait is encountered/loaded. |
That would require a global constructor in every dylib to extend those tables and to somehow locate all tables. Not every platform supported by rustc has global constructors in the first place, and on those that do it slows down startup, and this proposal has issues with unloading dylibs again. Also it would likely require the registration to happen in the crate that defines the object safe vtable (of which there are many) as opposed to the one that actually turns it into a trait object (which is less common) to ensure a single fixed offset is used across the entire process. With dlopen(RTLD_LOCAL) it may not even be possible to locate all tables. |
Maybe in cases like that, some central code is needed to manage trait loading/unloading, like on the JVM. |
Rust doesn't have a runtime. |
Given arbitrary traits Foo and Bar, it'd be great to have Box<Foo + Bar> compile, but there are unresolved questions that need to be laid out explicitly. (Moved from rust-lang/rust#32220)