Allow using $:literal containing integer to index into a tuple #91166

dtolnay · 2021-11-24T00:55:38Z

Example of the failure prior to this PR:

macro_rules! m {
    ($thing:ident . $field:literal) => {
        $thing . $field
    };
}

fn main() {
    let x = ('x',);
    let _ = m!(x.0);
}

error: unexpected token: `0`
 --> src/main.rs:3:18
  |
3 |         $thing . $field
  |                  ^^^^^^
...
9 |     let _ = m!(x.0);
  |             ------- in this macro invocation
  |
  = note: this error originates in the macro `m` (in Nightly builds, run with -Z macro-backtrace for more info)

rust-highfive · 2021-11-24T00:55:41Z

r? @cjgillot

(rust-highfive has picked a reviewer for you, use r? to override)

cjgillot · 2021-11-27T13:44:43Z

The implementation seems fine.
Since this allows more things in macro_rules, does lang team need to sign off on this?

scottmcm · 2021-12-03T20:37:26Z

Nominated for lang team discussion.

@dtolnay What is this going to do for the ol' x.0.0? For example, can I do this?

fn main() {
    let x = (('x',),);
    let _ = m!(x.0.0);
}

(Where 0.0 gets matched as a float literal.)

dtolnay · 2021-12-04T01:41:07Z

This PR keeps the current behavior for $ident . $float but I can change that too if we want it.

petrochenkov · 2021-12-04T03:33:26Z

There's a definitive method to determine whether something interpolated should parse or not - checking how it would work in token-based expansion model (aka "how it would work in proc macro output"), as opposed to AST-based model that is currently used for implementing $interpolated fragments internally.

$thing . $field in the example above will expand into x . ⟪ 0 ⟫ where ⟪ ... ⟫ is a Delimiter::None group that is more or less equivalent to regular ( ... ) parentheses.
Only single-token fragments should be expanded without Delimiter::None, now it's only $tt ($ident is also a single-token fragment, but it's currently wrapped, but we should change that and remove the wrapping).

Do we want x . ( 0 ) to parse as a field access? No.
So this change should not go alone and should be accompanied with a change to $interpolated fragment wrapping, e.g. we need to stop wrapping single-token $lit fragments and only keep the wrapping for literals with -, or something like that.

Also see #67062 for the current issues with parsing Delimiter::Nones.

dtolnay · 2021-12-04T04:43:15Z

I'm not sure I agree this change would need to be bundled together with eliminating Delimiter::None wrappers around $:literal passed into a proc macro input. Tweaking the token stream passed into proc macros can be quite disruptive, as we saw during the process of fixing #43081 (which caused a new None wrapper around $:ident in some cases). Meanwhile this change for macro_rules is important for tt-munchers to be able to handle Rust expression syntax, as I came across in anyhow::ensure!:

((…) . $member:ident $($rest:tt)*) => {…};
((…) . $member:literal $($rest:tt)*) => {…};

It doesn't seem great to have to disrupt proc macros in order for macro_rules to get this improvement.

scottmcm · 2021-12-21T13:16:16Z

We had a long conversation about this at the last lang team meeting. My takeaway was that we were open for some kind of change around this area, but weren't sure what form that should be.

To attempt to summarize some of the things that came up:

What's the right way to think of the distinction between literal or tt here? There are of course a ton of literals that definitely wouldn't work after expansion, like 0_i32 or 0.0e0. Could the use case work with just a tt instead? Why or why not?
We'd expect this to not work if a 0 would be matched as an expr, right? Is there something fundamental about the literal vs the general expression here?
From a "well it's just tokens" perspective, why should this work for 0 but not 0.0 when the 0.0 version works when matched as a tt? https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=9c7eeb41783c34c656364b1b885e4fe7
Should we add a new fieldname for use in macro_rules? It might be useful for things like matching struct definitions and literals too.

petrochenkov · 2021-12-22T06:12:14Z

What's the right way to think of the distinction between literal or tt here?

tts are meaningless and can be inserted into any token stream and blend into it without keeping any "parsing priorities".
E.g. if 1 + 2 is inserted into $interpolated * 3 as tts, then the result will be 1 + 2 * 3, resulting in 2 blending with * 3 into an expression.

Any other matchers are meaningful and must keep their meaning (or at least grouping) when inserted into other token streams.
E.g. if 1 + 2 is inserted into $interpolated * 3 as expr, then the result will be ⟪ 1 + 2 ⟫ * 3, so 1 + 2 keeps being an expression after re-parsing and its parts don't blend with tokens from the adjacent context.

We conservatively emit the Delimiter::None brackets for all matchers except tt to avoid their tokens partially blending with the context, but it's not always necessary.
For example, ident matcher is just a subset of tt, it always represents a single token, it cannot blend with anything (it's currently conservatively wrapped into Delimiter::None anyway, but I'd like to remove that wrapping).
The literal matcher is not always a single token, fortunately or not, it accepts things like - 1, which can be torn apart after insertion.
So the ⟪ - 1 ⟫ brackets are necessary at least in some cases, otherwise things like - 1 inserted into $literal () will parse incorrectly after insertion.
If 0.0 turns into 3 tokens (which I want to do, but it's not done yet), then it will requires the ⟪ 0.0 ⟫ brackets too, because without the brackets it will blend with the context in cases like tuple . 0.0, which it should not do because it must keep its meaning as a literal.

petrochenkov · 2021-12-22T06:18:30Z

The most important part to realize is that non-terminal tokens do not exist during parsing.
Yes, they may exist in the current implementation because the current implementation is old, but the language rules should work as if they didn't exist.

Imagine that we pass pass all tokens through a proc macro before parsing, proc macros do not support AST pieces by design, then all interpolated tokens are turned into their corresponding token streams, possibly including Delimiter::None brackets.
The language syntax must then be defined in terms of such token streams, and never non-terminal tokens.

petrochenkov · 2021-12-22T06:33:28Z

We'd expect this to not work if a 0 would be matched as an expr, right? Is there something fundamental about the literal vs the general expression here?

I don't see any difference between expr and literal matchers here, both can be potentially multi-token (or potentially single token).

If they are multi-token then they surely be guarded by Delimiter::None brackets during tokenization before parsing.
If they are single-token, then it's kind of a grey territory, there's no need to shield them, so the answer depends on the amount of conservativeness with which we add extra Delimiter::None brackets.

Right now we always add the brackets around all non-tt matchers, so both tuple.$literal and tuple.$expr are equivalent to tuple.⟪ 0 ⟫ (remember, nonterminal tokens do not exist) so if tuple.⟪ 0 ⟫ is syntactically correct, then both tuple.$literal and tuple.$expr must parse.

In theory we can omit the brackets from all single-token cases.
(This shouldn't even be a large breaking change, the problems usually arise when you add brackets and parsers don't expect them like #91166 (comment) says.)
In this case both tuple.$literal and tuple.$expr will also be equivalent, but now to tuple . 0 (again, nonterminal tokens do not exist + single-token matchers are not wrapped into brackets now), so again both tuple.$literal and tuple.$expr must parse.

Some hybrid schemes are also possible - we omit unnecessary Delimiter::Nones in some cases but not in others.
Then you can make literal and expr behave differently.
It just doesn't seem to be a very consistent approach.

petrochenkov · 2021-12-22T06:40:39Z

From a "well it's just tokens" perspective, why should this work for 0 but not 0.0 when the 0.0 version works when matched as a tt? https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=9c7eeb41783c34c656364b1b885e4fe7

See #91166 (comment) on the difference between "unstructured" tt and other matchers.

Should we add a new fieldname for use in macro_rules? It might be useful for things like matching struct definitions and literals too.

No opinion on this, I don't see anything harmful it it except for generally expanding the language surface.
Note: the fieldname matcher will always produce a single token, so it doesn't strictly require the Delimiter::None wrapping, similarly to ident.

dtolnay · 2021-12-22T06:56:52Z

The literal matcher is not always a single token, fortunately or not, it accepts things like - 1, which can be torn apart after insertion.
So the ⟪ - 1 ⟫ brackets are necessary at least in some cases, otherwise things like - 1 inserted into $literal () will parse incorrectly after insertion.

A None group containing two tokens -1 is not what I would want. IMO ideally $:ident would be 1:1 with TokenTree::Ident (i.e. passed to proc macros with no None around it ever) and $:literal would be 1:1 with TokenTree::Literal. Proc macro literals already support negatives just the same as $:literal does. For example you can create a negative literal single token with Literal::i8_unsuffixed(-1).

As long as nonterminal tokens still exist in the old parser implementation, I'd like to make them behave consistently with the above treatment of $:ident and $:literal. I think it's okay for fixes in that direction to land in the parser independently of whether the $:ident and $:literal nonterminals passed to a proc macro are still being wrapped in None.

bors · 2022-01-19T02:55:45Z

☔ The latest upstream changes (presumably #93048) made this pull request unmergeable. Please resolve the merge conflicts.

Currently fails to build: error: unexpected token: `0` --> src/test/ui/macros/macro-interpolation.rs:19:16 | LL | $var . $field | ^^^^^^ ... LL | let _ = field!(tuple.0); | --------------- in this macro invocation | = note: this error originates in the macro `field` (in Nightly builds, run with -Z macro-backtrace for more info) error: macro expansion ignores token `0` and any following --> src/test/ui/macros/macro-interpolation.rs:19:16 | LL | $var . $field | ^^^^^^ ... LL | let _ = field!(tuple.0); | --------------- caused by the macro expansion here | = note: the usage of `field!` is likely invalid in expression context error: aborting due to 2 previous errors

dtolnay · 2022-01-19T05:49:02Z

Rebase to resolve conflict with Let qpath contain NtTy: <$:ty as $:ty>::… #91150

dtolnay · 2022-01-19T05:55:21Z

Gonna mark this as blocked on at least #92392 to put $:literal on a path to using a single token always on the proc macro side. Personally I'd be fine landing this one first but I respect that folks want some guarantee that the macro_rules behavior and proc macro behavior on literals will be able to stay in sync.

dtolnay · 2022-01-25T20:44:25Z

I was able to work around this not working by duplicating all my macro input tokens: ($($tt)*) ($($tt)*) and then matching one set of them for $:literal and using the other set for the actual indexing expression:

(($expr:expr) ($dot:tt $literal:tt $($rest:tt)*) (. $unusable_literal:literal $($rest:tt)*)) => {
    $expr $dot $literal
};

(The full macro: https://github.com/dtolnay/anyhow/blob/1.0.53/src/ensure.rs#L310)

#92392 has been closed so I'll close this as well. I still think this should work but for now I'm unblocked so someone else will need to drive this. :)

rust-highfive assigned cjgillot Nov 24, 2021

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 24, 2021

camelid added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Nov 25, 2021

dtolnay assigned scottmcm and unassigned cjgillot Nov 27, 2021

scottmcm added the I-lang-nominated Nominated for discussion during a lang team meeting. label Dec 3, 2021

petrochenkov mentioned this pull request Dec 4, 2021

Let qpath contain NtTy: <$:ty as $:ty>::… #91150

Merged

scottmcm added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed I-lang-nominated Nominated for discussion during a lang team meeting. labels Dec 21, 2021

petrochenkov self-assigned this Dec 21, 2021

petrochenkov removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 22, 2021

dtolnay mentioned this pull request Dec 29, 2021

Store NtLiteral without generalizing to Expr #92392

Closed

petrochenkov mentioned this pull request Jan 7, 2022

proc-macro: Stop wrapping ident matchers into groups #92472

Merged

dtolnay added 2 commits January 18, 2022 21:44

Uninterpolate literal to use as field index

aadf994

dtolnay force-pushed the literal branch from 3ddd318 to aadf994 Compare January 19, 2022 05:48

dtolnay added S-blocked Status: Marked as blocked ❌ on something else such as an RFC or other implementation work. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jan 19, 2022

dtolnay closed this Jan 25, 2022

dtolnay deleted the literal branch January 25, 2022 20:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow using $:literal containing integer to index into a tuple #91166

Allow using $:literal containing integer to index into a tuple #91166

dtolnay commented Nov 24, 2021

rust-highfive commented Nov 24, 2021

cjgillot commented Nov 27, 2021

scottmcm commented Dec 3, 2021 •

edited

Loading

dtolnay commented Dec 4, 2021

petrochenkov commented Dec 4, 2021 •

edited

Loading

dtolnay commented Dec 4, 2021 •

edited

Loading

scottmcm commented Dec 21, 2021 •

edited

Loading

petrochenkov commented Dec 22, 2021 •

edited

Loading

petrochenkov commented Dec 22, 2021 •

edited

Loading

petrochenkov commented Dec 22, 2021 •

edited

Loading

petrochenkov commented Dec 22, 2021

dtolnay commented Dec 22, 2021 •

edited

Loading

bors commented Jan 19, 2022

dtolnay commented Jan 19, 2022

dtolnay commented Jan 19, 2022

dtolnay commented Jan 25, 2022

Allow using $:literal containing integer to index into a tuple #91166

Allow using $:literal containing integer to index into a tuple #91166

Conversation

dtolnay commented Nov 24, 2021

rust-highfive commented Nov 24, 2021

cjgillot commented Nov 27, 2021

scottmcm commented Dec 3, 2021 • edited Loading

dtolnay commented Dec 4, 2021

petrochenkov commented Dec 4, 2021 • edited Loading

dtolnay commented Dec 4, 2021 • edited Loading

scottmcm commented Dec 21, 2021 • edited Loading

petrochenkov commented Dec 22, 2021 • edited Loading

petrochenkov commented Dec 22, 2021 • edited Loading

petrochenkov commented Dec 22, 2021 • edited Loading

petrochenkov commented Dec 22, 2021

dtolnay commented Dec 22, 2021 • edited Loading

bors commented Jan 19, 2022

dtolnay commented Jan 19, 2022

dtolnay commented Jan 19, 2022

dtolnay commented Jan 25, 2022

scottmcm commented Dec 3, 2021 •

edited

Loading

petrochenkov commented Dec 4, 2021 •

edited

Loading

dtolnay commented Dec 4, 2021 •

edited

Loading

scottmcm commented Dec 21, 2021 •

edited

Loading

petrochenkov commented Dec 22, 2021 •

edited

Loading

petrochenkov commented Dec 22, 2021 •

edited

Loading

petrochenkov commented Dec 22, 2021 •

edited

Loading

dtolnay commented Dec 22, 2021 •

edited

Loading