Support for limited non-`'static` casts #6

sagebind · 2022-04-09T17:51:23Z

Allow casting to specific lifetime-free types from non-'static generic sources in specific contexts.

Figure out how to get autoderef under control
Figure out API
Determine if and how users might unsafely mark additional types as being lifetime-free
- Punt on this for now

Work in progress.

NobodyXu · 2022-04-10T01:48:14Z

Is it possible to cast a &T to &i32 using the new API?

sagebind · 2022-04-15T23:55:06Z

Not currently but it is theoretically possible to support, yes. I suspect it will be supported by the time I am able to release this.

sagebind · 2022-04-16T04:48:22Z

As far as API goes, I've figured out how to implement this automagically without any API changes. So cast!(value, u8) will just work even if value is not 'static.

src/lifetime_free.rs

NobodyXu · 2022-04-17T06:24:14Z

Determine if and how users might unsafely mark additional types as being lifetime-free

I think it might be useful, but maybe we can expose this functionality in another release?

chris-morgan · 2022-06-04T13:05:14Z

src/utils.rs

+/// Determine if two static, generic types are equal to each other.
+#[inline(always)]
+pub(crate) fn type_eq<T: 'static, U: 'static>() -> bool {
+    // Reduce the chance of `TypeId` collisions causing a problem by also


TypeId is by definition globally unique. To be sure, collisions can currently occur, but I still don’t think these additional checks are warranted, given the extreme rarity of the issue, that it’s acknowledged to be a soundness hole in Rust, that a fix (more or less) is in the pipeline, and that the additional checks here still don’t catch everything.

Fair enough, though none of these checks show up in a final binary since they are trivially optimized away so it doesn't really hurt. (As an aside, these checks were already present before this PR, they're just being rearranged here.)

chris-morgan · 2022-06-04T13:29:48Z

src/utils.rs

+    // a function which references itself. This is something that LLVM cannot
+    // merge, since each monomorphized function has a reference to a different
+    // global alias.
+    type_id_of::<T>() == type_id_of::<U>()


I strongly suspect this invokes undefined behaviour that makes it unsound, and that it only works by accident at present.

It assumes that (a) unused function pointers are not optimised out of existence; and (b) monomorphisation doesn’t merge these functions. I’d be surprised if the compiler guaranteed either of these. They both really feel like places where it only works because we don’t have a sufficiently smart compiler yet. (I know that when llogiq made his crate momo there was widespread agreement that the compiler should demonomorphise lots of stuff.)

If the compiler does start merging these, then you have transmute in safe code, with the type_name check bearing all the load of memory safety, which is a worrying thought.

This ain’t my crate, and I have no plan on using it (just came across it incidentally), but I suggest reverting this.

Regarding (a): The function pointers are not unused, since the function being invoked is itself the function that returns its own function pointer. When it is optimized away (at least in my inspections of what code is generated at various optimization levels) the expression itself is also optimized away, at which point the function pointers are no longer relevant (we're left with a folded constant true or false, which is all we really wanted). I have a hard time seeing the optimization step acting much differently for this code, though I suppose anything is possible.

Regarding (b): I walked through the source of LLVM's function merging implementation at several different versions to make my "hunch" a bit more objective, and at least today this is something that LLVM won't merge. However, you are right that it is possible that this will change in the future, and this says nothing of other rustc codegen backends or of optimization steps in rustc itself.

A better summary of the trick is: "These generic functions can't be merged unless they are already merged in order to have the same address space." My intuition could very well be wrong here, but it seems like merging such a function could actually result in incorrect observable behavior in some scenarios so it isn't clear to me that an optimization like this ever would be implemented intentionally, at least not without something smarter that understands the broader context of how the pointer is being used within the function.

To be sure, this is a hack, and not an ideal solution. Ideally, TypeId would have a way of being used without a 'static bound with the caveats that type comparisons ignore lifetimes, since we do our own independent checking of lifetime requirements with the LifetimeFree trait. However, it is unlikely that this will ever be supported since RFC 1849 was retracted.

This ain’t my crate, and I have no plan on using it (just came across it incidentally), but I suggest reverting this.

I did spend quite a bit of time weighing the pros and cons of this PR, but ultimately it did seem to me that at least currently the risk here is relatively low. Though to admit my bias, retracting this entire feature now, months after already being released, would be a pain.

eddyb · 2022-06-09T10:30:24Z

@sagebind replying out of line to #6 (comment) to avoid it being lost:

There is a way to compare types with lifetimes in it using TypeId, and it's really tricky to get lifetimes right (specifically types with more than one lifetime parameter - like, Foo<'a, 'static> and Foo<'static, 'a> must be detected as different) otherwise:

// This is emulating `type<'_>` HKTs and also could be simpler with GATs.
trait ApplyL<'a> {
    type Out;
}
trait LambdaL: for<'a> ApplyL<'a> {}
impl<X: for<'a> ApplyL<'a>> LambdaL for X {}

// `type<'_>` marker types (this can be automated with macros).
enum FooL1 {}
impl<'a> ApplyL<'a> for FooL1 {
    type Out = Foo<'a, 'static>;
}
enum FooL2 {}
impl<'a> ApplyL<'a> for FooL2 {
    type Out = Foo<'static, 'a>;
}
enum FooL1_2 {}
impl<'a> ApplyL<'a> for FooL1_2 {
    type Out = Foo<'a, 'a>;
}

// "Marked" type to use behind `dyn Any` to force downcasting to know
// the "marker" type `M`, even if it doesn't affect the data type.
struct Marked<M, T>(PhantomData<M>, T);

type MarkedForAnyL<X: LambdaL> = Marked<
    // Type that `TypeId::of` will effectively turn into a "structural" ID for
    // the shape of `for<'a> X<'a>` (i.e. multiple `ApplyL` implementors can
    // have the exact same `type Out = ...` and will be considered equal here).
    // The `&mut` allows `!Sized` `Out` types and reinforces invariant polymorphism.
    for<'a> fn(&'a mut <X as ApplyL<'a>>::Out),

    // "Storage" type that's `Any`-friendly (`'static` if `X` is).
    <X as ApplyL<'static>>::Out
>;

// Lifetime-invariant newtype of `dyn Any` (with custom ctors and downcasting).
pub struct AnyL<'a>(PhantomData<&'a mut &'a ()>, dyn Any);

impl<'a> AnyL<'a> {
    pub fn new_boxed<X: LambdaL + 'static>(
        x_a: <X as ApplyL<'a>>::Out,
    ) -> Box<AnyL<'a>> {
        // Erase `'a` (for `Any`-compatible storage).
        let x_static: <X as ApplyL<'static>>::Out
            = unsafe { transmute(x_a) };

        // Ensure that `for<'a> X<'a>`'s "shape" ends up in the `TypeId`.
        let marked_box: Box<MarkedForAnyL<X>>
            = Box::new(Marked(PhantomData, x_static));

        unsafe { transmute::<Box<dyn Any>, Box<AnyL<'a>>>(marked_box) }
    }

    pub fn downcast_ref<X: LambdaL + 'static>(&self) -> Option<&<X as ApplyL<'a>>::Out> {
        // Downcast to the type marked with `for<'a> X<'a>`'s "shape"
        let x_static: &<X as ApplyL<'static>>::Out
            = &self.1.downcast_ref::<MarkedForAnyL<X>>()?.1;

        // Reinject `'a` into the lifetime-erased "storage" type.
        Some(unsafe { transmute(x_static) })
    }
}

Sadly some of the shenanigans involved might still make the compiler struggle, and it's not going to be anywhere near pretty until we get GATs, but such an approach ensures FooL1, FooL2 and FooL1_2 result in distinct TypeIds, which would otherwise be impossible to distinguish (you'd have to either ban multiple-lifetime-position types or force the FooL1_2 version, and even then how do you guarantee it's used correctly?).

As for the function instantiation address taking thing: it simply looks broken/unsound - the only reason LLVM mergefunc would even consider them distinct is its implementation being too simple and nowhere near "graph isomorphism" (remember that we have unnamed_addr semantics for functions, which makes it valid to merge functions even if their addresses are observed).

cc @davidtwco can we make polymorphization treat T as unused in this snippet:

    fn unsound_type_id<T: ?Sized>() -> usize {
        unsound_type_id::<T> as usize
    }

It would be good to break anything like this before it gets used widely, as I can't find any theoretical justification for it (after all, a recursive self-call can't on its own be considered to use T, and observing the address must be strictly less meaningful than calling the function).

EDIT: I am being told by @lcnr (also working on polymorphization) that this will break at some point the future (assuming we get to turn that on by default), so this is not that theoretical.

Also, missed this the first time around:

This is something that LLVM cannot merge, since each monomorphized function has a reference to a different global alias.

At the risk of repeating myself, this isn't true, the "different global alias" can still have the same address thanks to unnamed_addr, and a more "bruteforce" search over the LLVM module should be able find the solution where merging the functions is possible, it would just be too expensive.

NobodyXu · 2022-06-09T10:46:41Z

@eddyb Then we definitely need another way to calculate the type id.

As for the lifetime, this is actually a none-issue for castaway because it only supports casting to types obeys LifetimeFree:

without any lifetime annotation
has 'static lifetime
all fields of it also obeys LifetimeFree.

So the only question now is to find a new way of calculating typeid for any type.

eddyb · 2022-06-09T11:07:45Z

I don't understand LifetimeFree. When can T: LifetimeFree hold but T: 'static doesn't?

If a lifetime parameter exists but can be ignored (for whatever reason), you could have a trait that instead has an associated type Erased: 'static; and convert between the original and erased types.

(EDIT: @lcnr is explaining it to me, there was a thing I missed, but still it's very confusing how the important part isn't really explained, and the trait doesn't have the : 'static bound I would've expected)

NobodyXu · 2022-06-09T11:30:31Z

I don't understand LifetimeFree. When can T: LifetimeFree hold but T: 'static doesn't?

Basically LifeTimeFree disallows struct S<'a> { PhantomData<&'a () } but allows struct S; to avoid any problem related to lifetime.

eddyb · 2022-06-09T11:35:50Z

I think part of the confusion here is that the trait and the cast are too far removed from the actual needs / reasoning behind this being possible at all.

IIUC, you want a "failible upcasting" of e.g. &T -> Option<&dyn Any>, which requires no bounds and succeeds when T happens to be a type that is "always 'static" pre-erasure, which AFAIK is described as "a type with no lifetime positions".

Or at least a version of TypeId::of that is -> Option<TypeId> with the same constraint.

In a different language, it could probably like if (T: 'static) { Some(TypeId::of::<T>()) } else { None }, but T: 'static here isn't really the same as the normal bound (it's more like a Never | Maybe | Always kind of thing, except Never doesn't make sense for T: 'static).

I can't think of any way to support this other than making it an official part of Any/TypeId.

However, there's an important distinction: this is still about T: 'static, just in a "worst-case specialization"/"dynamic" way, there's no way to use it to implement e.g. an unsound version of AnyL<'a>, as long as we add new functionality instead of relaxing the existing one.

So it might be way more amenable to getting added to Rust, compared to past proposals.

NobodyXu · 2022-06-09T11:40:51Z

That's kind of what castaway is doing here:

We only permit upcasting to T: 'static, but we can accept any non-'static type.
Otherwise, cast! and match! in castaway would end up having 'static bound everywhere and it makes it kind of hard to use.

So it might be way more amenable to getting added to Rust, compared to past proposals.

A better way would be to stablise specialisation, this would remove all the hack used in castaway and also add the ability to specialise for types implementing specific traits, or forward &&T to &T.

eddyb · 2022-06-09T11:50:33Z

A better way would be to stablise specialisation

Specialization has no way to allow specializing specifically in the situations where TypeId::of_lifetimeless (to give a name to my suggestion above) would return Some, and allowing more than that is generally unsound.

E.g. impl<T> TypeEq<T> for T {} can't be allowed as a specialization because it would impose constraints on lifetimes if ever used and there's no built-in trait to limit TypeEq in a way that the language would understand as being sound (even then, it would likely have to be on both sides i.e. trait TypeEq<T: NoLifetimes>: NoLifetimes, which would be worse than 'static bounds).

(Or are you suggesting replacing all of the uses of castaway with specialization?)

Otherwise, cast! and match! in castaway would end up having 'static bound everywhere and it makes it kind of hard to use.

Just "hard" or "impossible"? If this used like specialization, isn't it impossible to add the 'static bound in the first place, to whatever is being specialized?

NobodyXu · 2022-06-09T11:56:52Z

(Or are you suggesting replacing all of the uses of castaway with specialization?)

Yes, because castaway is essentially there to workaround lack of specialization.
If that get stablized, then we won't need this crate anymore.

Though it seems that the RFC is no where near stablization.

Just "hard" or "impossible"? If this used like specialization, isn't it impossible to add the 'static bound in the first place, to whatever is being specialized?

So castaway!($value, $type) basically returns a Ok if $value is of $type, or Err.

Without this PR, the type of $value must be 'static, making it impossible to use in many cases, including ToCompactString which I worked on and added specialization to it using castaway.

eddyb · 2022-07-24T21:00:49Z

Looks like @dtolnay is taking an interest in this kind of approach, might want to keep an eye on that in case it turns into an RFC (or join the discussion etc.).

According to #6 (comment) in reference to the `type_id_of::<T> as usize`-based implementation: > As for the function instantiation address taking thing: it simply looks broken/unsound - the only reason LLVM mergefunc would even consider them distinct is its implementation being too simple and nowhere near "graph isomorphism" (remember that we have `unnamed_addr` semantics for functions, which makes it valid to merge functions _even if their addresses are observed_). > I am being told by lcnr (also working on polymorphization) that this will break at some point the future (assuming we get to turn that on by default), so this is not that theoretical.

Support for limited non-'static casts

ed89b5d

Work in progress.

sagebind mentioned this pull request Apr 9, 2022

Specialisation for builtin type, without static lifetime constraint? #5

Closed

NobodyXu mentioned this pull request Apr 10, 2022

Add && Impl new trait ToCompactStr ParkMyCar/compact_str#16

Merged

sagebind added 2 commits April 15, 2022 20:16

Implement LifetimeFree for more types

f63c1d9

Resolve LifetimeFree casts with reworked autoderef

e002bd2

Add rough unsafe AssertLifetimeFree API

134032f

NobodyXu suggested changes Apr 16, 2022

View reviewed changes

src/lifetime_free.rs Show resolved Hide resolved

Document MSRV, add NonZero impls

0a78bcd

sagebind added 5 commits April 20, 2022 17:10

Remove AssertLifetimeFree for now

a7f4369

Fix wrong number of refs ignoring final lifetime-free impl

a5657d1

Rename transmute_owned to transmute_unchecked for clarity

bc9a218

Update example

0102a98

Enable std feature by default

8a408ff

sagebind marked this pull request as ready for review April 20, 2022 23:05

sagebind merged commit bd8ea00 into master Apr 20, 2022

sagebind deleted the non-static branch April 20, 2022 23:10

chris-morgan reviewed Jun 4, 2022

View reviewed changes

eddyb mentioned this pull request Jul 24, 2022

Tracking issue for non_static_type_id rust-lang/rust#41875

Closed

3 tasks

dtolnay mentioned this pull request Sep 19, 2023

More robust type_eq_non_static implementation #14

Merged

miraclx mentioned this pull request May 8, 2024

feat(sdk): drop custom type discriminant calimero-network/core#261

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support for limited non-`'static` casts #6

Support for limited non-`'static` casts #6

sagebind commented Apr 9, 2022 •

edited

Loading

NobodyXu commented Apr 10, 2022 •

edited

Loading

sagebind commented Apr 15, 2022

sagebind commented Apr 16, 2022

NobodyXu commented Apr 17, 2022

chris-morgan Jun 4, 2022

sagebind Jun 4, 2022

chris-morgan Jun 4, 2022

sagebind Jun 4, 2022

eddyb commented Jun 9, 2022 •

edited

Loading

NobodyXu commented Jun 9, 2022

eddyb commented Jun 9, 2022 •

edited

Loading

NobodyXu commented Jun 9, 2022

eddyb commented Jun 9, 2022

NobodyXu commented Jun 9, 2022

eddyb commented Jun 9, 2022 •

edited

Loading

NobodyXu commented Jun 9, 2022 •

edited

Loading

eddyb commented Jul 24, 2022

Support for limited non-'static casts #6

Support for limited non-'static casts #6

Conversation

sagebind commented Apr 9, 2022 • edited Loading

NobodyXu commented Apr 10, 2022 • edited Loading

sagebind commented Apr 15, 2022

sagebind commented Apr 16, 2022

NobodyXu commented Apr 17, 2022

chris-morgan Jun 4, 2022

Choose a reason for hiding this comment

sagebind Jun 4, 2022

Choose a reason for hiding this comment

chris-morgan Jun 4, 2022

Choose a reason for hiding this comment

sagebind Jun 4, 2022

Choose a reason for hiding this comment

eddyb commented Jun 9, 2022 • edited Loading

NobodyXu commented Jun 9, 2022

eddyb commented Jun 9, 2022 • edited Loading

NobodyXu commented Jun 9, 2022

eddyb commented Jun 9, 2022

NobodyXu commented Jun 9, 2022

eddyb commented Jun 9, 2022 • edited Loading

NobodyXu commented Jun 9, 2022 • edited Loading

eddyb commented Jul 24, 2022

Support for limited non-`'static` casts #6

Support for limited non-`'static` casts #6

sagebind commented Apr 9, 2022 •

edited

Loading

NobodyXu commented Apr 10, 2022 •

edited

Loading

eddyb commented Jun 9, 2022 •

edited

Loading

eddyb commented Jun 9, 2022 •

edited

Loading

eddyb commented Jun 9, 2022 •

edited

Loading

NobodyXu commented Jun 9, 2022 •

edited

Loading