Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add encoding for f16 and f128 #122106

Closed
wants to merge 1 commit into from
Closed

Conversation

tgross35
Copy link
Contributor

@tgross35 tgross35 commented Mar 6, 2024

This establishes new character encodings for the new primitive types tracked at #116909:

  • k for f16
  • q for f128

We cannot easily be consistent with Itanium because it uses a multi-character encoding for _Float16 (Dh), which I believe would conflict with dyn anyway. We could use Itanium's g for f128, but q is more consistent with f=float=f32, d=double=f64, q=quad=f128 (unfortunately h for half/f16 is already taken).

Per @michaelwoerister this will need a compiler FCP

See also previous discussion at rust-lang/rustc-demangle#64. Currently, the compiler will just ICE if attempting to v0 mangle these types.

@rustbot label +T-compiler +needs-fcp

Cc @rcvalle @ehuss

@rustbot
Copy link
Collaborator

rustbot commented Mar 6, 2024

r? @JohnTitor

rustbot has assigned @JohnTitor.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. needs-fcp This change is insta-stable, so needs a completed FCP to proceed. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Mar 6, 2024
@tgross35
Copy link
Contributor Author

tgross35 commented Mar 6, 2024

r? @michaelwoerister

@michaelwoerister
Copy link
Member

This PR reserves space in the v0 symbol mangling grammar for the new f16 and f128 primitive types. The change to the grammar is backwards compatible and straightforward. Since the change is public-facing (i.e. it needs to be picked up by downstream tooling), let's do an FCP.

@rfcbot merge

Thanks for opening the PR, @tgross35!

@rfcbot
Copy link

rfcbot commented Mar 7, 2024

Team member @michaelwoerister has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Mar 7, 2024
@wesleywiser
Copy link
Member

What happens when attempting to demangle symbols that use the new encodings with an older version of the demangling code? Since we have some level of support for v0 mangling in the wild, we should consider how making this change will affect existing tooling.

@tgross35
Copy link
Contributor Author

tgross35 commented Mar 8, 2024

What happens when attempting to demangle symbols that use the new encodings with an older version of the demangling code? Since we have some level of support for v0 mangling in the wild, we should consider how making this change will affect existing tooling.

That is a good question, the docs don't seem to call out any specific forward compatibility or error handling. It looks like rustc-demangle just gives up and errors in these cases, which likely is the same for most implementations.

I think we can probably live with the fact that existing demanglers will just error until they can update support. To me this does not feel like a breaking change because it does not change anything that currently demangles correctly.

It is probably still reasonable to try to make sure at least known demanglers support these symbols before stabilizing the primitives, I added it to the todo list #116909 (comment).

We could also update documentation to reserve a set of characters for future primitives that demanglers should gracefully handle (??) to help in case we add anything like bf16/decimal32/i256/bitintN`, but I don't think this is worth it.

@nagisa
Copy link
Member

nagisa commented Mar 9, 2024

Q: does the k cover all different formats (e.g. bfloat16) of 16-bit float, or just a specific one?

@tgross35
Copy link
Contributor Author

tgross35 commented Mar 10, 2024

Q: does the k cover all different formats (e.g. bfloat16) of 16-bit float, or just a specific one?

I think ideally it would be different, leaving k only for binary16 f16. Brain float would need to be monomorphized separately so it would probably make sense as a different symbol. Itanium uses DF16b.

@michaelwoerister
Copy link
Member

Yes, it's likely that existing demanglers will fail if they encounter something unknown in the grammar. If we had to start from scratch with a new mangling scheme, then forward compatibility would be on my set of requirements. As things are now, I think the best approach is to document new additions early (before they are emitted by a stable compiler) and try to get support merged in external tooling during that still-unstable window.

@eddyb
Copy link
Member

eddyb commented Mar 29, 2024

I somewhat regret reproducing the Itanium primitive encoding, it doesn't scale well, so its only benefit is compactly mentioning a lot of primitives (more relevant to C++ overloads than to Rust).


I have nothing against the reservation but I should point out there are alternatives.

If today rustc always generates non-zero crate disambiguators, then C3f16 would demangle correctly, and never conflict with a crate named f16.

While less pretty, we could also reserve special "namespaces" (using crate name+disambiguator values impossible to arise for real crates), or even give new primitives full identities in libcore using lang items.

@tgross35
Copy link
Contributor Author

tgross35 commented Apr 1, 2024

I somewhat regret reproducing the Itanium primitive encoding, it doesn't scale well, so its only benefit is compactly mentioning a lot of primitives (more relevant to C++ overloads than to Rust).

I have nothing against the reservation but I should point out there are alternatives.

If today rustc always generates non-zero crate disambiguators, then C3f16 would demangle correctly, and never conflict with a crate named f16.

While less pretty, we could also reserve special "namespaces" (using crate name+disambiguator values impossible to arise for real crates), or even give new primitives full identities in libcore using lang items.

Is your position neutral here, or would you prefer investigating the above at this point?

I do think there is some nice consistency in saying that all u, i, and f primitives <= 128 bits have similar representations (as proposed). Any larger types (i256, u256, f256) or anything not following this pattern (bf16) would then be a reasonable place to switch to a more generic representation.


@wesleywiser were the above answers reasonable enough, or did you have any further concerns? For what it is worth, one demangler has already picked up the additions #116909 (comment).

@Mark-Simulacrum
Copy link
Member

It seems like giving these primitives full identities in libcore is pretty cheap (we'll want methods on them anyway) and presumably means that there's no 3-5 year period while tooling starts to support demangling symbols with them present?

That feels like a much better option to me, the only thing we're gaining by doing the current reservation approach seems to be a bit of efficiency in symbol length, right? That seems like a pretty minor win (symbols are already pretty long).

I'm still constantly on systems that don't demangle baseline v0 out of the box in perf or other tooling, so this would essentially reset the clock (albeit only for code using these types...).

@michaelwoerister
Copy link
Member

If the expectation is that there'll be more additions like this (e.g. the bf16 type mentioned) then I would also prefer emitting regular paths (e.g. to core::primitives aliases). That would allow currently demanglers to decode symbols containing the new types. If we decide to do that here, that should also be the policy for all similar cases going forward.

@jackh726
Copy link
Member

jackh726 commented Apr 4, 2024

I'd like to echo the above points as a soft concern (I'm not convinced to check my box, but wouldn't block if it would otherwise land).

I think if we foresee future changes like this, we should consider that doing this dance every time (FCPing, waiting for downstream support, thinking about backwards-compatibility, etc.) is not ideal and that there may be better approaches (like emitting regular paths mentioned above).

@michaelwoerister
Copy link
Member

Given the above conversation I guess I'm going to file a concern on my own FCP 🙂

@rfcbot concern Compatibility with existing tools

The extremely long latency before downstream tooling like perf or gdb can pick up changes to the grammar has been by far the main obstacle to using the v0 scheme by default. The change to the grammar here is minimal but both @eddyb's C3f16 proposal and using a full path to something in libcore are good alternatives -- especially when considering that these types won't be very common in symbol names.

If noboby objects, I'm going to cancel the FCP and suggest that we just use C3f16 for now (which does not require a grammar change or an FCP)

@michaelwoerister
Copy link
Member

Here's a proposal for solving these kinds of problems in a more structured way: rust-lang/compiler-team#737

@michaelwoerister
Copy link
Member

I'm seeing no objections to cancelling the FCP, so:
@rfcbot fcp cancel

I think MCP 737 provides a good, general solution to evolving the mangling scheme without too much breakage.

@rfcbot
Copy link

rfcbot commented Apr 10, 2024

@michaelwoerister proposal cancelled.

@rfcbot rfcbot removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Apr 10, 2024
tgross35 added a commit to tgross35/rust that referenced this pull request Apr 11, 2024
As discussed at <rust-lang#122106>, use the
crate encoding to represent new primitives.
@tgross35
Copy link
Contributor Author

Thanks everyone for the feedback, I have opened #123816 to supercede this.

@tgross35 tgross35 closed this Apr 11, 2024
@tgross35 tgross35 deleted the patch-1 branch April 11, 2024 18:33
tgross35 added a commit to tgross35/rust that referenced this pull request Apr 12, 2024
As discussed at <rust-lang#122106>, use the
crate encoding to represent new primitives.
tgross35 added a commit to tgross35/rust that referenced this pull request Apr 20, 2024
As discussed at <rust-lang#122106>, use the
crate encoding to represent new primitives.
tgross35 added a commit to tgross35/rust that referenced this pull request May 14, 2024
As discussed at <rust-lang#122106>, use the
crate encoding to represent new primitives.
bors added a commit to rust-lang-ci/rust that referenced this pull request May 14, 2024
…lwoerister

Add v0 symbol mangling for `f16` and `f128`

As discussed at <rust-lang#122106>, use the crate encoding to represent new primitives.
cuviper pushed a commit to cuviper/rust that referenced this pull request May 25, 2024
As discussed at <rust-lang#122106>, use the
crate encoding to represent new primitives.

(cherry picked from commit 809b84e)
workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Jun 12, 2024
…r-demangling-recommendation, r=davidtwco

Recommend to never display zero disambiguators when demangling v0 symbols

This PR extends the [v0 symbol mangling documentation](https://doc.rust-lang.org/rustc/symbol-mangling/v0.html) with the strong recommendation that demanglers should never display zero-disambiguators, especially when dealing with `crate-root`.

Being able to rely on `C3foo` to be rendered as `foo` (i.e. without explicit disambiguator value) rather than as `foo[0]` allows the compiler to encode things like new basic types in a backward compatible way. This idea has been originally proposed by `@eddyb` in [the discussion around supporting `f16` and `f128` in the v0 mangling scheme](rust-lang#122106). It is a generally useful mechanism for supporting a certain class of new elements in the v0 mangling scheme in a backward compatible way (whether as a temporary workaround until downstream tooling has picked up grammar changes or as a permanent encoding).

cc `@tgross35`
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Jun 12, 2024
Rollup merge of rust-lang#124514 - michaelwoerister:zero-disambiguator-demangling-recommendation, r=davidtwco

Recommend to never display zero disambiguators when demangling v0 symbols

This PR extends the [v0 symbol mangling documentation](https://doc.rust-lang.org/rustc/symbol-mangling/v0.html) with the strong recommendation that demanglers should never display zero-disambiguators, especially when dealing with `crate-root`.

Being able to rely on `C3foo` to be rendered as `foo` (i.e. without explicit disambiguator value) rather than as `foo[0]` allows the compiler to encode things like new basic types in a backward compatible way. This idea has been originally proposed by `@eddyb` in [the discussion around supporting `f16` and `f128` in the v0 mangling scheme](rust-lang#122106). It is a generally useful mechanism for supporting a certain class of new elements in the v0 mangling scheme in a backward compatible way (whether as a temporary workaround until downstream tooling has picked up grammar changes or as a permanent encoding).

cc `@tgross35`
mattstam pushed a commit to succinctlabs/rust that referenced this pull request Aug 1, 2024
As discussed at <rust-lang/rust#122106>, use the
crate encoding to represent new primitives.

(cherry picked from commit 809b84e)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-fcp This change is insta-stable, so needs a completed FCP to proceed. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants