RFC: Macro syntax to count sequence repetitions #88

Closed

lilyball
Contributor

@huonw
Member

huonw commented May 23, 2014

@sinistersnare

Off Topic: You can see the rendered view by going to the files changed tab and clicking Rendered on the top of the file. Albeit it is a diff-rendered view, not a regular view.

@lilyball
Contributor Author

@sinistersnare You can go to the Files changed tab and click View and you get exactly what I just linked. I put it in my comment just in case people didn't know that.

@bstrie
Contributor

bstrie commented May 23, 2014

Under "Alternatives" you say that a count! macro would be limited to accepting expr arguments, but that's not true. The following is a macro (which I've named arity!) which accepts arguments of category expr, pat, ident, ty, and block, along with test cases. I've also implemented an alternative to vec! (the vec_new! macro, below) that properly sets the capacity, as shown in your example.

#![feature(macro_rules)]

macro_rules! vec_new(
    ($($e:expr),*) => ({
        // leading _ to allow empty construction without a warning.
        let mut _temp = ::std::vec::Vec::with_capacity(arity!($($e),*));
        $(_temp.push($e);)*
        _temp
    });
    ($($e:expr),+,) => (vec_new!($($e),+))
)

// Trust me, you *really* want the helper macros...
// If you attempt this without them, may God have mercy on your soul.
macro_rules! arity(
    ($($thing:expr),+) => (arity_expr!($($thing),*));
    ($($thing:pat),+) => (arity_pat!($($thing),*));
    () => (0);
)

macro_rules! arity_expr(
    ($head:expr, $($tail:expr),*) => (1 + arity_expr!($($tail),*));
    ($last:expr) => (1);
)

macro_rules! arity_pat(
    ($head:pat, $($tail:pat),*) => (1 + arity_pat!($($tail),*));
    ($last:pat) => (1);
)

fn test_arity() {
    assert!(arity!() == 0);

    // expr
    assert!(arity!(9) == 1);
    assert!(arity!(9,9,9) == 3);

    // ident - matched by expr
    assert!(arity!(Nine) == 1);
    assert!(arity!(Nine,Nine,Nine) == 3);

    // ty - matched by expr
    assert!(arity!(Option::<int>) == 1);
    assert!(arity!(Option::<int>,Option::<int>,Option::<int>) == 3);

    // block - matched by expr
    assert!(arity!({9;9}) == 1);
    assert!(arity!({9;9},{9;9},{9;9}) == 3);

    // pat
    assert!(arity!(foo @ _) == 1);
    assert!(arity!(foo @ _,foo @ _,foo @ _) == 3);

    println!("All tests pass");
}

fn main() {
    test_arity();
    let v = vec_new![7,8,9];
    println!("{}", v);
}

I can't think of any cases where this wouldn't suffice. True, arity! expands to something like 1 + 1 + 1, but surely LLVM does constant folding. :)
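For readers on current Rust: the code above uses 2014 syntax (`#![feature(macro_rules)]`, parenthesized macro bodies, `int`) and will not compile today. The same recursive counting idea, restricted to `expr` arguments, might be sketched as follows. The names `count_exprs!` and `vec_with_cap!` are made up for this illustration; they are not from the thread or the standard library.

```rust
// Hypothetical names; a modern-syntax sketch of the recursive counting idea.
macro_rules! count_exprs {
    // No arguments: the count is zero.
    () => { 0usize };
    // Peel off one expression and recurse on the rest.
    ($head:expr $(, $tail:expr)*) => { 1usize + count_exprs!($($tail),*) };
}

macro_rules! vec_with_cap {
    ($($e:expr),* $(,)?) => {{
        // Leading underscore keeps the empty invocation free of warnings.
        let mut _temp = ::std::vec::Vec::with_capacity(count_exprs!($($e),*));
        $(_temp.push($e);)*
        _temp
    }};
}

fn main() {
    let v: Vec<i32> = vec_with_cap![7, 8, 9];
    assert_eq!(v, [7, 8, 9]);
    assert!(v.capacity() >= 3);
    println!("{:?}", v);
}
```

The count still expands to `1usize + 1usize + 1usize + 0usize`, so it is an expression rather than a literal, which matters for the objections raised later in the thread.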

@bstrie
Contributor

bstrie commented May 23, 2014

There is no general alternative for counting the iteration number.

Eh, I don't know about that:

#![feature(macro_rules)]

macro_rules! iota(
    ($($thing:expr),+) => (iota_expr!(1, $($thing),*));
    ($($thing:pat),+) => (iota_pat!(1, $($thing),*));
    () => (());
)

macro_rules! iota_expr(
    ($id:expr, $head:expr, $($tail:expr),*) => ({
        println!("expr {}", $id); iota_expr!($id+1, $($tail),*);
    });
    ($id:expr, $last:expr) => ({ println!("expr {}", $id); });
)

macro_rules! iota_pat(
    ($id:expr, $head:pat, $($tail:pat),*) => ({
        println!("pat {}", $id); iota_pat!($id+1, $($tail),*);
    });
    ($id:expr, $last:pat) => ({ println!("pat {}", $id); });
)

fn test_iota() {
    println!("empty test");
    iota!();

    // expr
    println!("expr test");
    iota!(9);
    iota!(9,9,9);

    // ident - matched by expr
    println!("ident test");
    iota!(Nine);
    iota!(Nine,Nine,Nine);

    // ty - matched by expr
    println!("ty test");
    iota!(Option::<int>);
    iota!(Option::<int>,Option::<int>,Option::<int>);

    // block - matched by expr
    println!("block test");
    iota!({9;9});
    iota!({9;9},{9;9},{9;9});

    // pat
    println!("pat test");
    iota!(foo @ _);
    iota!(foo @ _,foo @ _,foo @ _);
}

fn main() {
    test_iota();
}

Though this pattern does get mightily inconvenient, mighty fast. The idea of initializing statics that utilize the current iteration number is a tantalizing problem, but I'm not sure if it's possible using this approach.

@lilyball
Contributor Author

@bstrie It didn't occur to me that you could have the same macro match multiple different types of nonterminals, although I would not be surprised if you can trigger parse errors that way (e.g. the first arg could match as an expression, so it does, but then the second arg is a real pattern and can't match as an expr).

As for iota!(), that looks rather tortured, and it only works for macros with a single sequence. It won't work for anything that uses multiple sequences.

@lilyball
Contributor Author

Yeah, your arity!() macro can't handle arity!(None, Some(_)), because it thinks it's doing expressions and I really wanted patterns.

@huonw
Member

huonw commented May 23, 2014

Would something like arity!(pat: None, Some(_)) or arity!(expr: None, Some(1)) be unacceptable?

@lilyball
Contributor Author

@huonw That seems extremely hacky to be exporting from libstd.

@huonw
Member

huonw commented May 23, 2014

It seems far more controlled (and less hacky) than a count macro that just takes arbitrary arguments without a "type" (e.g. arity!(tt: foo bar (whatever) 1 2 "3" 4) would work).

However, I think I still prefer a proper repeat-count for $-nonterminals to that.

@paulstansifer

I think we could implement arity!(...) as a procedural macro that traverses its token-tree arguments, looking for commas. It would be accurate, provided that there aren't any constructs in Rust with "naked" (non-delimiter-enclosed) commas, and it would be fast (faster than the arity!(...) proposed above, but not as fast as $#(...)).
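The "count the top-level commas" idea can be illustrated without writing an actual procedural macro. The sketch below is a declarative stand-in, not the implementation proposed here: it walks the input as `tt` fragments, so anything inside `()`, `[]` or `{}` is a single token tree and its commas are hidden. The names `count_commas!` and `arity_by_commas!` are hypothetical.

```rust
// Hypothetical names; a macro_rules stand-in for the procedural macro idea.
macro_rules! count_commas {
    () => { 0usize };
    // A comma at the top level counts as one separator.
    (, $($rest:tt)*) => { 1usize + count_commas!($($rest)*) };
    // Any other token tree (including a whole delimited group) is skipped.
    ($other:tt $($rest:tt)*) => { count_commas!($($rest)*) };
}

macro_rules! arity_by_commas {
    () => { 0usize };
    // Non-empty input: number of items is number of top-level commas plus one.
    ($($toks:tt)+) => { 1usize + count_commas!($($toks)+) };
}

fn main() {
    assert_eq!(arity_by_commas!(), 0);
    // Works for patterns and expressions alike, since nothing is parsed.
    assert_eq!(arity_by_commas!(None, Some(_)), 2);
    // Commas inside delimiters are not seen.
    assert_eq!(arity_by_commas!(f(a, b), [1, 2], {3; 4}), 3);
    // A "naked" comma: angle brackets are not token-tree delimiters,
    // so one generic type is miscounted as two items.
    assert_eq!(arity_by_commas!(HashMap<u8, u8>), 2);
}
```

The last assertion shows the caveat mentioned above: the approach is only accurate when no item contains a comma outside of a delimiter pair.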

If we were willing to squirrel information away into the parse_sess, I think we could even make a built-in procedural macro for $(#), which we could call repeat_index!(). I'm not sure how the community feels about built-in procedural macros that require special support from the compiler, but it's always nice not to have to introduce new punctuation syntax.

@lilyball
Contributor Author

@paulstansifer Do you feel that a procedural macro would be the better approach? I feel like that's trying to paper over a deficiency in the macro syntax. Better to just fix macros to be able to count the sequences directly, than to try and recreate that ability externally.

@SimonSapin
Contributor

Servo could use this. For now, we have a Python template that generates Rust code equivalent to a struct with a bunch of bool fields, but more compact (the storage is [uint, ..MAGIC_NUMBER]).

The plan is to replace the template with procedural macros, but this RFC would allow that particular bit to be a much simpler macro_rules! macro.

@paulstansifer

@kballard At least for arity! I do. If the semantics can be accomplished through "normal" mechanisms, by all means let us use those mechanisms. I'm more conflicted about repeat_index!, seeing as it is more like "magic", but even that depends on whether we're willing to see parse_sess as an API for procedural macros to use. (disclaimer: I haven't looked at the new, exciting, user-defined procedural macros yet to see whether they get access to parse_sess)

@paulstansifer

@SimonSapin Unfortunately, I doubt that you'll be able to write [uint, ..arity!($($arg),*)] any time soon, because I believe the spot after the .. is (from the parser's point of view) not an expression but a plain number.

I have an idea for a change we could make to at least allow [uint, ..$size], where $size is an argument to a helper macro that gets passed arity!(...), but it would probably require RFCing nonetheless. I guess that maybe implementing macros-in-number-position would probably be easier, so maybe we should do that instead?

@SimonSapin
Contributor

I don’t know how the parser works exactly, but this compiles fine:

static A: uint = 42;
static B: [u8, ..A] = [0, ..A];

@paulstansifer

@SimonSapin Oh, sure enough! It's syntactically an expression (handled by maybe_parse_fixed_vstore). Forget what I said earlier, arity!(...) would work fine there.
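In current Rust the fixed-size array syntax is `[T; N]` rather than `[T, ..N]`, and the length is still a constant expression. A minimal sketch of a counting macro used in that position, with `count_exprs!` as a hypothetical helper that is not part of the thread or of std:

```rust
// Hypothetical helper: counts comma-separated expressions by recursion.
macro_rules! count_exprs {
    () => { 0usize };
    ($head:expr $(, $tail:expr)*) => { 1usize + count_exprs!($($tail),*) };
}

// Modern spelling of the two statics shown above.
const A: usize = 42;
static B: [u8; A] = [0; A];

// The macro expands to a constant expression, so it is accepted as a length.
static C: [u8; count_exprs!(1, 2, 3)] = [0; count_exprs!(1, 2, 3)];

fn main() {
    assert_eq!(B.len(), 42);
    assert_eq!(C.len(), 3);
}
```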

@lilyball
Contributor Author

Using a hack to try and count the number of repetitions seems distinctly sub-par to just letting the macro parser count them itself, since the macro parser should already know the answer. It also seems more fragile (e.g. that multi-typed arity!() that was pasted can be tricked into breaking). And it increases the set of global macro names that are taken.

@paulstansifer

@kballard The solution that I proposed shouldn't be fragile at all. I admit that it's a little funny to essentially say "Paste this in, separated by commas, and then count the commas", but I don't think it's too bad. And name pollution is exactly my concern; instead of claiming a punctuation sequence for the concept, we can just use a descriptive name, freeing # to be used for something that can't be implemented as a macro at all. Furthermore, a name is something that users can look up in documentation and search for online.

That said, I am sympathetic to the idea of just acquiring the information directly, since it'd be shorter to write. But I really like the idea of implementing a feature without having to change the language.

@pnkfelix
Member

@kballard Is there an example of a case where $#($x) can express something that arity!($x, ...) could not express?

E.g. the discussion above has led me to think that arity! could be used to accomplish the goals here, and the main argument for a specialized syntax like $#($x) over providing a macro form like arity! is just that the former is shorter to type. Is that accurate?

If so, I recommend you revise the RFC so that this feature is provided via a macro like arity!(x, ...), because I think that will ease the road to acceptance.

@lilyball
Contributor Author

@pnkfelix arity!() has to be custom-implemented for every single nonterminal type, it behaves unexpectedly if you try and use it (for whatever reason) with expressions instead of $nonterminal tokens (e.g. arity!(None, Some(_)) as mentioned before), and it also expands to a sequence of 1+1+1+1+1+...+0 which makes it only suitable for use where expressions are allowed. As a contrived example, it can't possibly work with the following:

match count {
    Some(arity!($($arg),*)) => {
        println!("the count matches the args");
    }
    _ => {
        println!("wrong arg count");
    }
}

However $#($arg) would work just fine there, as it would evaluate to an unsuffixed integer literal.
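For comparison, a workaround that is not discussed in this thread: in current Rust a named constant can appear in a pattern, so the `1 + 1 + ...` expression can be bound to a `const` first and the constant matched against. This is only a sketch under that assumption; `count_exprs!`, `check_count!` and `EXPECTED` are hypothetical names.

```rust
// Hypothetical helper: counts comma-separated expressions by recursion.
macro_rules! count_exprs {
    () => { 0usize };
    ($head:expr $(, $tail:expr)*) => { 1usize + count_exprs!($($tail),*) };
}

macro_rules! check_count {
    ($count:expr; $($arg:expr),*) => {{
        // `1 + 1 + 0` cannot sit in pattern position directly,
        // but a constant initialised from it can.
        const EXPECTED: usize = count_exprs!($($arg),*);
        match $count {
            Some(EXPECTED) => "the count matches the args",
            _ => "wrong arg count",
        }
    }};
}

fn main() {
    assert_eq!(check_count!(Some(3); 'a', 'b', 'c'), "the count matches the args");
    assert_eq!(check_count!(Some(2); 'a', 'b', 'c'), "wrong arg count");
    assert_eq!(check_count!(None; 'a'), "wrong arg count");
}
```

This needs an extra item per use, so it does not remove the appeal of a construct that expands to a plain integer literal.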

@rkjnsn
Contributor

rkjnsn commented Jun 25, 2014

Those limitations are just for the macro_rules! version, right? What about an implementation using a procedural macro as suggested by paulstansifer?

@lilyball
Contributor Author

@rkjnsn You're right, I was only thinking of the macro_rules! version.

Still, I think that the idea of using arity!($($arg),*) is rather hacky, both because it's reserving the global macro name arity!() for something that's only for specialized use cases inside of other macros, and because it's trying to reconstruct knowledge that the macro expander already has.

It also just plain seems weird. For it to work on all nonterminal types, it needs to work on the raw token stream, and literally just count the number of commas and add 1. That's the only way it would work with e.g. the tt nonterminal. And it seems like it's being rather dependent upon the current implementation of macro_rules!(), and would break completely if e.g. it was decided that it made sense to actually reify a new token stream from each nonterminal instead of merely using token::INTERPOLATED(token::Nonterminal).

Basically, the arity!() suggestion seems like a workaround for the lack of $#(), and I'd rather just have $#().

@SimonSapin
Contributor

arity!() only covers half of what’s proposed in this RFC (“how many repetitions a $() sequence will expand to”). I’d also like to have “the current iteration number inside of a $() sequence”. See my earlier message for a use case.

@SimonSapin
Contributor

To clarify: I don’t care about what it looks like (custom syntax or built-in macro) as long as both the total count and the current iteration number can be obtained somehow.

@pnkfelix
Member

I can appreciate the argument that it's bad to reserve the global name arity!, though I do wonder if future revisions to the macro system to better integrate it with the module system could alleviate that.

cc @jbclements who might be able to weigh in with some perspective on where the macro system is going.

I will have to review the procedural macro system a bit more before I can comment further on @kballard's other point.

(And I will also have to play around a little with the current iteration number desiderata. That does indeed seem important.)

@kballard : In the meantime, it would be good to add a motivating example for $(#) to the RFC.

@jbclements

In my mind, it looks like adding $(#) is a substantially deeper addition to the existing system than adding arity!; as an example, it's pretty clear to me how a macro scoping system could allow you to get arity! only when you want it, but it sounds much hairier to design a macro scoping system that could turn things like $(#) on and off.

@paulstansifer

@kballard It'll never make sense to reify the INTERPOLATED tokens: the performance cost would be high, and it would buy nothing.

I agree that "comma-separate, then count commas" is kinda weird, but I argue that reserving a name is strictly better than reserving a punctuation character. After all, there are tons of possible names, but not a lot of punctuation. What's more, arity! is a great description of what the macro does, so in addition to being easier to look up in documentation, it's less likely to need to be looked up.

@lilyball
Contributor Author

@paulstansifer

It'll never make sense to reify the INTERPOLATED tokens: the performance cost would be high, and it would buy nothing.

That's certainly plausible, but I don't like making a macro that relies upon the current implementation details of nonterminal tokens.

I agree that "comma-separate, then count commas" is kinda weird, but I argue that reserving a name is strictly better than reserving a punctuation character.

The $#() syntax would only be special inside of a macro_rules! definition. And macros already basically reserve tokens starting with $ as special, so I'm not worried about $# being given new meaning outside of macros.

What's more, arity! is a great description of what the macro does, so in addition to being easier to look up in documentation, it's less likely to need to be looked up.

I disagree. It's actually a bad name. It's not actually finding the arity of anything at all, it's just a hack that tries to count nonterminals by counting commas. Offhand, I'd expect something like arity!() to take a value of a function type and return the arity of the function (of course, macros are syntactic, so that wouldn't exactly work, but the point is that's what the name "arity" suggests). Note of course that the word "arity" does not mean the number of arguments that are given to a function, but instead is the number of arguments that a function/operator accepts. So it's a function of the type of the function/operator, not of the invocation.

@pnkfelix
Member

@kballard do you think there are more points that need to be discussed here?

Assuming not, would you like to attempt to update the RFC in the manner I outlined earlier? Or should we perhaps plan to postpone this work for post 1.0, since macro_rules is feature-gated for 1.0 anyway?

@lilyball
Contributor Author

@pnkfelix I don't think there's anything more to discuss. I've actually had this page open on my main computer for a few weeks, to remind me to get around to updating the RFC, I just haven't had the time lately. I don't think I'll have time tonight either, but I'll try to get to it sometime this week.

My biggest concern with postponing this to after 1.0 is that vec![] still doesn't use the element count as a capacity, which is an obvious optimization, and we've rejected ad-hoc solutions in favor of waiting for the more general solution. If we don't have the more general solution for 1.0, then either vec![] will continue being a bit stupid, or we'll end up with an ad-hoc solution.

@sfackler
Member

I think we're free to mess with vec!'s internals post-1.0 all we like as long as we don't break any users of it, which I can't imagine we would.

@SimonSapin
Contributor

Looks like rust-lang/rust@02d976a fixed the performance of vec![] without counting macro arguments. (std::slice::BoxedSlice::into_vec does not allocate.)
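The `std::slice::BoxedSlice` name is from 2014. In current Rust the corresponding conversion is, as far as I can tell, the inherent `into_vec` method on `Box<[T]>`, which reuses the box's allocation. A small sketch of that conversion; `boxed_slice_to_vec` is a hypothetical wrapper used only for illustration:

```rust
// Hypothetical wrapper around the standard `into_vec` conversion.
fn boxed_slice_to_vec(boxed: Box<[i32]>) -> Vec<i32> {
    // Converts the boxed slice into a Vec without copying the elements.
    boxed.into_vec()
}

fn main() {
    // The array is boxed once; its length is known, so no counting is needed.
    let v = boxed_slice_to_vec(Box::new([7, 8, 9]));
    assert_eq!(v, [7, 8, 9]);
    assert!(v.capacity() >= v.len());
    println!("{:?}", v);
}
```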

@mahkoh
Contributor

mahkoh commented Oct 13, 2014

NB: There is currently no way to get the repetition number (without writing a compiler plugin.)

@pnkfelix
Member

@kballard given SimonSapin's observation from yesterday, I am inclined to postpone this RFC to be addressed post 1.0. Do you think that's reasonable?

@lilyball
Contributor Author

@pnkfelix There's still plenty of use for this RFC. For example, there's a macro in the compiler (at least one, but possibly more than one) that could really benefit from this RFC (IIRC it wants the repetition number feature). And I seem to recall a couple of months ago having code in a personal project that could have used the repetition number feature as well.

That said, if it's not clear that we want to take this yet, postponing to after 1.0 is acceptable. The vec![] macro was the most important use-case, and that no longer applies.

@pnkfelix
Member

(unassigning self; I think we should close as postponed.)

@pnkfelix
Member

@kballard The team agrees that there is definitely utility in this feature. But we don't think there is a pressing need to put this extension in before 1.0.

So we are closing as postponed.

pnkfelix closed this Oct 23, 2014

@pnkfelix
Member

(p.s. @kballard thanks for all the effort you put into this. I really do think we will get something good out of this process once we can come back around to addressing this feature.)

@Stebalien
Contributor

Yeah, your arity!() macro can't handle arity!(None, Some(_)), because it thinks it's doing expressions and I really wanted patterns.

@kballard for completeness, that's easily fixed. The following seems to work for everything but item and meta (it allows mixed lists but there shouldn't be any ambiguity):

macro_rules! arity {
    ($e:tt, $($toks:tt)+) => { 1 + arity!($($toks)+) };
    ($e:tt) => { 1 };

    ($e:pat, $($toks:tt)+) => { 1 + arity!($($toks)+) };
    ($e:pat) => { 1 };

    ($e:expr, $($toks:tt)+) => { 1 + arity!($($toks)+) };
    ($e:expr) => { 1 };

    () => { 0 };
}

@alkis

alkis commented Oct 14, 2015

@kballard are you still pursuing this? I am in need of the repetition index in some macros I am writing and there is currently no way of getting this in the current state of macros. Is the discussion done here folded into some other rfc that would provide such functionality in some other way? Or do you have plans of pursuing this rfc further?

@Stebalien
Contributor

@alkis it's possible but you need to use recursion (you can't do it with the $(...)* syntax).

@alkis

alkis commented Oct 14, 2015

@Stebalien if you use recursion the repetition index will not be a literal but an expression. The index of the N'th item will expand to something like 0 + 1 + 1 + ... + 1. If you want the index to be used for accessing an element of a tuple unfortunately an expression won't work. AFAIK only a literal works.
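A sketch of the recursive approach, to make the limitation concrete. The macro carries the index along as an expression and grows it by one per step; `push_indexed!` is a hypothetical name, not something from the thread.

```rust
// Hypothetical helper: pushes (index, item) pairs, building the index by recursion.
macro_rules! push_indexed {
    // Nothing left to process.
    ($out:ident; $idx:expr;) => {};
    // Emit the current item with its index, then recurse with index + 1.
    ($out:ident; $idx:expr; $head:expr $(, $tail:expr)*) => {
        $out.push(($idx, $head));
        push_indexed!($out; $idx + 1usize; $($tail),*);
    };
}

fn main() {
    let mut out = Vec::new();
    push_indexed!(out; 0usize; "a", "b", "c");
    // The indices are expressions such as `0usize + 1usize + 1usize`,
    // which evaluate correctly but are not integer literals.
    assert_eq!(out, vec![(0usize, "a"), (1, "b"), (2, "c")]);
}
```

Because the third index is the expression `0usize + 1usize + 1usize` and not the token `2`, it can be used as a value or an array index, but not in a tuple field access like `t.2`.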

@Stebalien
Contributor

@alkis I see. No, this doesn't work with tuples etc.

@alkis

alkis commented Oct 20, 2015

So what needs to be done to advance this RFC?

@bwo

bwo commented Jun 27, 2017

is this rfc abandoned? superseded?

@SimonSapin
Contributor

I think procedural macros rust-lang/rust#38356 will make this largely unnecessary.

@Stebalien
Contributor

@SimonSapin Procedural macros are generally a heavy-weight solution and counting is something I've needed quite often in declarative macros (although building a 0 + 1 + 1 ... expression is usually sufficient).
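One drawback of the `0 + 1 + 1 ...` style is that it recurses once per item, so long inputs can run into the compiler's macro recursion limit. A different declarative idiom, not taken from this thread, replaces every token tree with `()` and asks a slice for its length. The names `unit_for!` and `count_tts!` are hypothetical.

```rust
// Hypothetical helper: maps any single token tree to the unit value.
macro_rules! unit_for {
    ($_ignored:tt) => { () };
}

// Hypothetical helper: counts token trees without recursing per item.
macro_rules! count_tts {
    () => { 0usize };
    // Build an array of units, one per token tree, and take its length.
    ($($t:tt)+) => { <[()]>::len(&[$(unit_for!($t)),+]) };
}

fn main() {
    assert_eq!(count_tts!(), 0);
    assert_eq!(count_tts!(a b c), 3);
    // Delimited groups are one token tree each.
    assert_eq!(count_tts!((1, 2) [3] {4}), 3);
}
```

Like the recursive version, this yields an expression rather than a literal, so it does not help with the tuple-index case raised earlier.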

@SimonSapin
Contributor

At least building this as a procedural macro outside of the standard library would make it easier to make a case for inclusion in std in a new RFC.

@Stebalien
Contributor

Oh, I see. Good point!
