Macro matchers only match when they feel like it #27832

jonas-schievink · 2015-08-14T12:39:01Z

...or at least that's how much I currently understand, since macros are really counterintuitive sometimes.

macro_rules! m {
    ( $i:ident ) => ();
    ( $t:tt $j:tt ) => ();
}

fn main() {
    m!(c);
    m!(t 9);  // why does this work, but not the next case?

    m!(0 9);
    // ^ error: expected ident, found 0
}

I don't think this is supposed to happen. Even if it is, the exact rules used for macro matching should definitely be documented somewhere (I think there's not even an RFC).

durka · 2015-08-14T15:37:30Z

It seems like the ident parser in particular is very greedy, or something. For example, changing it from ident to expr eliminates the error (as does switching the order of the rules, but I assume this is reduced from something where you can't do that).

However, fixing this (if it can be fixed) is probably a breaking change :(
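The rule reordering mentioned above can be sketched as follows. This is only an illustration: the string results and the `chosen_arms` helper are not part of the original report and are added here so the selected arm is observable.

```rust
// Same two rules as in the report, with the more general `tt tt` rule first.
// Each arm expands to a string literal naming the arm that matched.
macro_rules! m {
    ( $t:tt $j:tt ) => ("two tts");
    ( $i:ident ) => ("ident");
}

/// Hypothetical helper: the arms chosen for the three invocations in the report.
fn chosen_arms() -> [&'static str; 3] {
    [m!(c), m!(t 9), m!(0 9)]
}
```

With this order the `ident` parser is never asked to look at `0`, so all three invocations are accepted. Calling `chosen_arms()` from `main` and printing the result shows which arm each one took.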

jonas-schievink · 2015-08-15T11:40:00Z

Found the culprit: https://github.com/rust-lang/rust/blob/master/src/libsyntax/ext/tt/macro_parser.rs#L514-L522

// this could be handled like a token, since it is one

Do you really think this would break something? I actually don't think it would, since a fix for this should only accept more macro invocations.

jonas-schievink · 2015-08-15T14:51:15Z

On second thought... I think that's not the direct cause, since all other fragments behave roughly the same.

Oh well, back to the drawing board

jonas-schievink · 2015-08-15T22:19:53Z

Okay, so as it turns out, all NT parsers introduce this bug (except tt, which you can't really test). In this particular case, changing from ident to expr only worked because idents and integral literals are both valid expressions. For example, this doesn't work:

macro_rules! m {
    ( $b:expr ) => ();
    ( $t:tt $u:tt ) => ();
}

fn main() {
    m!(3);      // works trivially
    m!(1 2);    // works, since `1` is a valid expression
    m!(_ 1);    // doesn't work, since `_` is not an expression (but a valid TT, of course)
}

Now that this confusion is out of the way, I think I see what happens: When parsing _ as an expression, libsyntax panics because of a syntax error, which aborts compilation (the macro docs state that the parser fully commits to parsing such a nonterminal, so the fact that the third invocation doesn't work is expected behaviour).

However, when parsing 1, it works. But since the 2 token wasn't eaten, the second arm is tried (this is the bug!). It matches, of course, and the macro is accepted.

So, fixing this would indeed be a breaking change, since the intended behaviour is (at least as far as I know) to reject the invocation, and that isn't happening.

DanielKeep · 2015-08-17T18:14:45Z

@jonas-schievink I disagree that attempting the second arm is a bug. Ideally, macro_rules! should attempt each rule until it finds one that matches. From my perspective, the bug is that macro_rules! has no way to recover from failed parse attempts.

Ideally, your last example should go something like this (using ^ to represent cursor positions):

( 3 )
- (^3 ) ; (^$b:expr ) → $b = 3
- ( 3^) ; ( $b:expr^) → matched
( 1 2 )
- (^1 2 ) ; (^$b:expr ) → $b = 1
- ( 1^2 ) ; ( $b:expr^) → input too long; next rule
- (^1 2 ) ; (^$t:tt $u:tt ) → $t = 1
- ( 1^2 ) ; ( $t:tt^$u:tt ) → $u = 2
- ( 1 2^) ; ( $t:tt $u:tt^) → matched
( _ 1 )
- (^_ 1 ) ; (^$b:expr ) → syntax error; next rule.
- (^_ 1 ) ; (^$t:tt $u:tt ) → $t = _
- ( _^1 ) ; ( $t:tt^$u:tt ) → $u = 1
- ( _ 1^) ; ( $t:tt $u:tt^) → matched
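The traces above can be approximated with a toy model. This is a rough sketch under strong assumptions, not the compiler's algorithm: every fragment consumes exactly one token, an "expression" is just an integer literal or an alphabetic name, and `Frag`, `accepts` and `first_matching_rule` are hypothetical names.

```rust
/// Simplified fragment kinds for this sketch (hypothetical, not rustc types).
#[derive(Clone, Copy)]
enum Frag {
    Expr,
    Tt,
}

/// Toy stand-in for "parse one fragment": does `tok` satisfy `frag`?
/// An expression here is an integer literal or an alphabetic name; `_` is neither.
fn accepts(frag: Frag, tok: &str) -> bool {
    match frag {
        Frag::Tt => true,
        Frag::Expr => {
            tok.parse::<i64>().is_ok() || tok.chars().all(|c| c.is_alphabetic())
        }
    }
}

/// Try every rule from the start of the input. A failed rule (syntax error or
/// wrong input length) moves on to the next rule instead of aborting.
/// Returns the index of the first rule that matches the whole input.
fn first_matching_rule(rules: &[Vec<Frag>], input: &[&str]) -> Option<usize> {
    rules.iter().position(|rule| {
        rule.len() == input.len()
            && rule.iter().zip(input).all(|(frag, tok)| accepts(*frag, tok))
    })
}
```

With the rules `[[Expr], [Tt, Tt]]`, the inputs `3`, `1 2` and `_ 1` resolve to rule 0, rule 1 and rule 1 respectively, which matches the three traces.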

jonas-schievink · 2015-08-17T19:12:00Z

Ideally, macro_rules! should attempt each rule until it finds one that matches.

That would indeed be useful! But I think this comment implies that that isn't the intended behaviour:

rust/src/libsyntax/ext/tt/macro_parser.rs, lines 11 to 13 in c115c51:

//! This is an Earley-like parser, without support for in-grammar nonterminals,
//! only by calling out to the main rust parser for named nonterminals (which it
//! commits to fully when it hits one in a grammar). This means that there are no

(stating that the macro parser fully commits to NTs implies to me that it doesn't backtrack to try other arms)

DanielKeep · 2015-08-18T03:36:02Z

@jonas-schievink I believe that's referring to how it parses within a rule. Earley parsers can deal with ambiguities by tracking multiple potential parse forests (if I remember correctly; my understanding is a little vague). What it's saying is that it has to commit to parsing a non-terminal (i.e. higher-level productions like expressions) because the parser doesn't have any way to back out of a partial parse. So when it encounters one, it has to parse it, come hell or high water.

Having the macro system not check successive rules once a rule starts matching would be apocalyptic: it would kill damn near every useful, non-trivial macro. We're talking mass hysteria, cats and dogs living together.
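A small example of the kind of macro that depends on this fall-through, written as an illustration (the `sum!` macro and `total` helper are not from the thread):

```rust
// The first rule starts matching every invocation: it parses the leading
// expression. For inputs with more than one element it then fails at the
// comma, and the second rule has to be tried.
macro_rules! sum {
    ( $x:expr ) => ($x);
    ( $x:expr, $($rest:expr),+ ) => ($x + sum!($($rest),+));
}

/// Hypothetical helper that exercises both rules through recursion.
fn total() -> i32 {
    sum!(1, 2, 3)
}
```

If the expander gave up as soon as the first rule had consumed `1`, `sum!(1, 2, 3)` could never reach the recursive rule.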

jonas-schievink · 2015-08-18T20:28:16Z

I believe that's referring to how it parses within a rule.

Fair enough. In that case the bug is just that the macro expander will panic when it can't parse an NT, so it can't backtrack.

I also managed to dig up #3232, which was closed as "not a bug", but this definitely feels like one.

dylanede · 2017-01-29T23:18:30Z

@jonas-schievink Sorry to dig this up again, but isn't an ident a terminal, so your comments about NTs aren't applicable to this bug?

jonas-schievink · 2017-01-29T23:21:41Z

@dylanede Correct! That's why I mentioned this comment:

// this could be handled like a token, since it is one

(AFAIK, token == terminal)

jeberger · 2017-06-23T07:52:14Z

I got bitten by this bug too. Note that I see two different issues here:

1. The parser accepts m!(1 2), which according to @jonas-schievink is a bug. Indeed, according to the docs I would have expected the parser to refuse, but I can live with it either way and I can see where fixing the issue would cause code to break.
2. The parser refuses m!(_ 1), which should be a no-brainer: _ is not an ident, so the first rule should not even begin to match and no backtracking is required to move on to the next rule. Moreover I don't see where fixing this would break anything, since it would only allow code that currently does not compile (although I have no idea how hard fixing this may turn out to be).


…seyfried

Only match a fragment specifier if it starts with certain tokens.

When trying to match a fragment specifier, we first predict whether the current token can be matched at all. If it cannot be matched, don't bother to push the Earley item to `bb_eis`. This can fix a lot of issues which would otherwise require full backtracking (#42838).

In this PR the prediction treatment is not done for `:item`, `:stmt` and `:tt`, but it could be expanded in the future.

Fixes #24189. Fixes #26444. Fixes #27832. Fixes #34030. Fixes #35650. Fixes #39964. Fixes the 4th comment in #40569. Fixes the issue blocking #40984.
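The first-token prediction described in this PR can be sketched roughly as below. This is a simplified model and not the compiler's code: `Frag`, `may_begin_with` and `candidate_rules` are hypothetical names used here, tokens are plain strings, and the check for expressions is a crude approximation.

```rust
/// Simplified fragment kinds; the real compiler has many more.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Frag {
    Ident,
    Expr,
    Tt,
}

/// Hypothetical first-token check: could `tok` start a fragment of kind `frag`?
fn may_begin_with(frag: Frag, tok: &str) -> bool {
    let first = tok.chars().next();
    // `_` on its own is treated as neither an identifier nor an expression here.
    let ident_like =
        tok != "_" && matches!(first, Some(c) if c.is_alphabetic() || c == '_');
    let literal_like = matches!(first, Some(c) if c.is_ascii_digit());
    match frag {
        Frag::Ident => ident_like,
        Frag::Expr => ident_like || literal_like, // rough approximation only
        Frag::Tt => true, // no prediction for `tt`, as in the PR text
    }
}

/// Indices of the rules still worth trying, given each rule's first fragment
/// and the first token of the invocation.
fn candidate_rules(first_frags: &[Frag], tok: &str) -> Vec<usize> {
    first_frags
        .iter()
        .enumerate()
        .filter(|(_, frag)| may_begin_with(**frag, tok))
        .map(|(i, _)| i)
        .collect()
}
```

For the macro from the original report, `candidate_rules(&[Frag::Ident, Frag::Tt], "0")` leaves only rule 1. The `ident` parser is never invoked on `0`, so there is no fatal parse error to recover from.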

jdm added the A-macros Area: All kinds of macros (custom derive, macro_rules!, proc macros, ..) label Aug 14, 2015

durka mentioned this issue Dec 24, 2015

Add macro lifetime specifier #25509

Closed

durka mentioned this issue Apr 27, 2016

Add parse_generics! and parse_where! macros rust-lang/rfcs#1583

Closed

Phlosioneer mentioned this issue Jun 21, 2016

Macro lifetimes aren't a thing (yet) rust-bakery/nom#268

Closed

jseyfried self-assigned this Jan 12, 2017

lambda-fairy mentioned this issue May 10, 2017

Suggest ! for bitwise negation when encountering a ~ #41722

Merged

kennytm mentioned this issue Jun 26, 2017

Only match a fragment specifier if it starts with certain tokens. #42913

Merged

bors closed this as completed in #42913 Jul 11, 2017

dtolnay mentioned this issue Jun 10, 2018

Unable to fall through past $:lifetime matcher #51477

Closed
