Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Amendment to RFC 439 for grammar ambiguity #498

Closed
wants to merge 1 commit into from

Conversation

aturon
Copy link
Member

@aturon aturon commented Dec 5, 2014

This amends the cmp/ops reform RFC to deal with the grammar ambiguity around ..j notation.

@aturon aturon mentioned this pull request Dec 5, 2014
@nrc
Copy link
Member

nrc commented Dec 5, 2014

+1

@sfackler
Copy link
Member

sfackler commented Dec 5, 2014

Should we yank i.. as well then? Seems kind of weird to have one but not the other.

@aturon
Copy link
Member Author

aturon commented Dec 5, 2014

@sfackler

Should we yank i.. as well then? Seems kind of weird to have one but not the other.

I agree that it's a bit asymmetric, but i.. has much more of an ergonomic impact, since it saves you having to get at the length explicitly. It's also a nice shorthand for the typical use of iter::count (and I suspect other uses as well).

I believe it would be backwards-compatible to add i.. notation later on, though, so we could consider not having it for 1.0.

@tikue
Copy link
Contributor

tikue commented Dec 5, 2014

While I agree that ..i isn't terribly useful, the asymmetry is still an ugly wart. I'd prefer to change the syntax for fixed-size array literals. Could the revisions address why changing the syntax for fixed-size array literals is not being considered?

@aturon
Copy link
Member Author

aturon commented Dec 5, 2014

@tikue The short answer is: the existing syntax emerged only after a very long bikeshed, and time until 1.0 is very short.

@tikue
Copy link
Contributor

tikue commented Dec 5, 2014

The bikeshed wasn't aware of the conflict with the Range proposal, since the Range proposal occurred months later. I'm fairly confident that bikeshed would have been painted a different color had this conflict been predicted then.

@ftxqxd
Copy link
Contributor

ftxqxd commented Dec 5, 2014

I’m not 100% sure that I like this way of resolving the ambiguity. It just makes the syntax inconsistent: people expect to be able to write foo[..bar] from seeing foo[bar..] and foo[bar..baz]. The problem with the ambiguity could be minimally resolved by disallowing ..bar expressions as the final element in two-element array literals, but that would be inconsistent and generally strange. I think that the best way to resolve this ambiguity would be (unfortunately) to disallow range literals that aren’t directly inside the [] of indexing operations. That would stop one from being able to do for i in 0..100. Python does something similar: foo[start:end] is just sugar for foo[slice(start, end)], but start:end is not valid anywhere else. (Python actually goes even further, allowing foo[start:end:step, start2:end2:step2, ...] as sugar for foo[(slice(start, end, step), slice(start2, end2, step2), Ellipsis)].) The advantage of this way of resolving the ambiguity is that it’s still possible to later (after 1.0) allow (some subset of) the slicing syntax in normal expressions. Python did something similar with its .../Ellipsis desugaring: in Python 2, ... was only allowed in indexing expressions; in Python 3, it’s now allowed everywhere.

@sinistersnare
Copy link

If the .. syntax could be ambiguous, we could look towards python and maybe use : for range syntax?

It would make range literals weird though.

@tikue
Copy link
Contributor

tikue commented Dec 5, 2014

@P1start that completely defeats the purpose of this particular RFC: that a..b is sugar for a Range struct, explicitly not slicing sugar.

@ftxqxd
Copy link
Contributor

ftxqxd commented Dec 5, 2014

@sinistersnare : has already been planned for type ascription (e.g., frob(iter.collect(): Vec<int>)), so that would still be ambiguous.

@sinistersnare
Copy link

Ah yes I forgot about that. Damn.

@tikue No it does not, this PR is to remove syntax due to ambiguity, and I was trying to resolve it

@tikue
Copy link
Contributor

tikue commented Dec 5, 2014

@sinistersnare my comment was directed at @P1start

@sinistersnare
Copy link

Ah ok :)

@ftxqxd
Copy link
Contributor

ftxqxd commented Dec 5, 2014

@tikue I wouldn’t say that the entire point of that part of the RFC was to make a..b sugar for a Range struct. I would say that the main advantage given by that part of the RFC is the removal of the Slice traits. Allowing a..b only in slice patterns makes sense to me because you still get to unify the Index and Slice traits while at the same time avoiding any ambiguities created from allowing a..b in arbitrary expressions. a..b wouldn’t become slicing sugar; it would still be a range expression, just one that is only allowed in indexing expressions.

@tikue
Copy link
Contributor

tikue commented Dec 6, 2014

Ok, from that perspective it makes more sense. Still, going from the inconsistency of allowing a.. but not ..a to the inconsistency of allowing v[..a] but not for i in range ..a { ... } doesn't seem like a win.

@SiegeLord
Copy link

The array syntax was always terrible, I don't see what 1.0 being close has to do with resigning to supporting it forever. Changing it a mere syntactic change unlike dozens of other features that are changed/introduced now which have semantic components.

@tikue
Copy link
Contributor

tikue commented Dec 6, 2014

Agreed. The cons of changing array literal syntax now are:

  • opens the door for another long and frustrating bikeshed
  • breaks tons of code (though could be trivially fixed with a script)

The pros are:

  • fixes an ugly syntax
  • allows for consistency in the range syntax

The benefits of the pros will outlive the short term pain of the cons.

@glaebhoerl
Copy link
Contributor

I personally think the array syntax is nice enough, and I'd have a hard time coming up with a better one. I assume the people saying we should change it do have a particular syntax in mind? But debating it here would unduly derail this thread, so the best way forward would probably be to start a bikeshedding thread on discourse, and then come back with it here if there's consensus around some better option.

@quantheory
Copy link
Contributor

Using the full range notation here does not seem too onerous, since it involves only writing an extra 0.

This may be true in the standard library, but it is not obviously true in general. In the original RFC 439, the different Range type parameters are not required to be integers. IIUC, this implies that you could have Index implementations to let you do things like:

  • dictionary[..'antelope']
  • three_d_array[(2, 2, 2)..]
  • week[Tuesday..Thursday]

Removing RangeTo does constrain such applications of this syntax, by forcing you to know and explicitly specify an appropriate lower bound:

  • dictionary['a'..'antelope']

(Note also that if such applications of slicing syntax are allowed, the [] syntax becomes much more useful for similar reasons. For ordered collections where consecutive values are not consecutive in memory, or multidimensional arrays, the value returned by slicing will almost definitely not be a standard slice, but rather some struct providing its own interface to view/mutate a subset of the container. foo[] in this case becomes a clear way of creating such a struct over the whole collection, without knowing the bounds on the collection, and where using the * operator to mean "create a slice/view object" would be confusing.)

@aturon
Copy link
Member Author

aturon commented Dec 9, 2014

@quantheory Those are some excellent points.

FWIW, if some clever person wants to devise a better solution that retains fulls symmetry and removes the ambiguity, I'd be all ears! I tried to stay away from revising the fixed array syntax, since that was apparently the result of a massive bikeshedding effort, but perhaps things have changed since then.

Given that this PR is entirely about a syntactic change, I don't personally have a problem with such discussion happening right here.

@mdinger
Copy link
Contributor

mdinger commented Dec 9, 2014

If you mean rust-lang/rust#9879, the reason they were reevaluating the array syntax (as the bug notes) is because it's ugly. Not because it's conflicting. They also didn't find anything much better or have a dire need.

@tikue
Copy link
Contributor

tikue commented Dec 9, 2014

I am in favor of anything that can resolve the ambiguity and is not obviously worse than the current fixed-array syntax . I prefer reusing symbols if possible. It'd be nice if something could be chosen that would look equally nice for the type syntax.

[x]{n}
[x #n]
[x, ||n||] // magnitude `n`
[x in n] // in `n` slots
[x @ n] // at `n` slots
[x, for n] // for `n` slots 
[x => n] // x goes to `n` slots

I haven't thought hard about these; some probably have obvious issues.

@aturon
Copy link
Member Author

aturon commented Dec 9, 2014

@mdinger

If you mean rust-lang/rust#9879, the reason they were reevaluating the array syntax (as the bug notes) is because it's ugly. Not because it's conflicting. They also didn't find anything much better or have a dire need.

No, I was referring to a long mailing list thread that @nikomatsakis showed me, which I think predated this issue.

@ftxqxd
Copy link
Contributor

ftxqxd commented Dec 9, 2014

Perhaps we could just do away with the array repeat syntax altogether. It can’t be used for much other than arrays of scalars (due to how it only works for Copy types), and if we had consts in generic parameters, we wouldn’t need it anyway, and there could even be a generalised function that uses Clone instead of Copy (because it wouldn’t even be part of the language). We could provide a replacement macro for the time being:

macro_rules! repeat {
    ($e: expr, $n: expr) => {{
        let r: [_, ..$n] = ::std::mem::uninitialised();
        let e = $e;
        // Do some unsafe magicks here, copying e (perhaps with .clone())
        r
    }}
}

Unfortunately, that wouldn’t work in constant expressions unless we got CTFE. And constant expressions are a pretty common use case for the array repeat syntax.

@aturon
Copy link
Member Author

aturon commented Dec 9, 2014

@tikue I could live with several of those, personally. :-)

@netvl
Copy link

netvl commented Dec 9, 2014

-1. I don't like the asymmetry; it would be very surprising that a.. works but ..a doesn't. If I had to choose between indexing and array literals, I'd choose indexing.

@Kimundi
Copy link
Member

Kimundi commented Dec 9, 2014

Hm, thinking a bit more about @tikue 's proposals:

struct T;
const N: uint = 5;

let a: [T]{N} = [T]{N}; // Seems verbose, might be ambiguous with one-element array expressions?
let a: [T #N] = [T #N]; // This seems like a decent option
let a: [T, #N] = [T, #N]; // Might be ambiguous with attributes-on-expressions
let a: [T, ||N||] = [T, ||N||]; // I think this one is too verbose :P
let a: [T in N] = [T in N]; // Might be ambiguous with the new placement expression proposal
let a: [T @ N] = [T @ N]; // Might be a problem if we ever want to make @ a prefix operator again
let a: [T, for N] = [T, for N]; // I like his one, but it might be a parsing ambiguity with for-loop expressions.
let a: [T for N] = [T for N]; // This one (without comma), I think, does not have a parsing ambiguity
let a: [T => N] = [T => N]; // Feels a bit verbose, and looks like matching or mapping.

I think of these, the for one reads best, followed by #. @ and in would also look nice, but read too much like "inside" rather than "repeated N times".

@mdinger
Copy link
Contributor

mdinger commented Dec 9, 2014

Also:

let a: [T by N] = [T by N]; // Probably closes to nice `[T * N]` which @bjz noted 
let a: [T, by N] = [T, by N];
let a: [T at N] = [T at N];
let a: [T, at N] = [T, at N];
let a: [T dup N] = [T dup N]; // duplicate
let a: [T, dup N] = [T, dup N];
let a: [T times N] = [T times N];
let a: [T, times N] = [T, times N];
let a: [T rep N] = [T rep N]; // repeat
let a: [T, rep N] = [T, rep N];

Note: @bjz's note

@netvl
Copy link

netvl commented Dec 9, 2014

@mdinger, I think the idea was not to introduce new keywords.

Anyway, shouldn't this be discussed in a separate RFC?

@mdinger
Copy link
Contributor

mdinger commented Dec 9, 2014

I was not aware of that.

@aturon
Copy link
Member Author

aturon commented Dec 9, 2014

@netvl @mdinger

We could potentially introduce a new keyword if it was strongly justified. I also am happy for bikeshedding to happen here, because this RFC amendment is purely about syntax.

@glaebhoerl
Copy link
Contributor

There's also no reason why the T necessarily has to come before the N.

@Kimundi
Copy link
Member

Kimundi commented Dec 9, 2014

Heh, another idea (It won't parse because of the Or operator however):

let a: [T | N] = [T | N];

@mdinger
Copy link
Contributor

mdinger commented Dec 9, 2014

These could extend to multiple dimensions if that was ever an explicit goal:

// Using `for`
let a: [T for N by M by W] // If this made sense:
// These 3 are equivalent:
let a: [T for N by 1]
let a: [T for 1 by N]
let a: [T for N]// same as suggested by @kimundi

// Flipped and using `of`
let a: [N by M by W of T]// If this worked
// These 3 are equivalent
let a: [N by 1 of T]
let a: [1 by N of T]
let a: [N of T]

@tikue
Copy link
Contributor

tikue commented Dec 9, 2014

In terms of more radical changes:

[x for _ in CONST_RANGE]

, which is extensible to

[CONST_EXPR( for PAT in CONST_RANGE)+]

@sfackler
Copy link
Member

I would personally prefer range literals to only be allowed in [] to the removal of one of the half open variants. for i in 1..10 is cute but not really worth the inconsistency IMO. I'd basically be treating the Range, RangeFrom, etc structs implementation details of the slice syntax.

If we're up for bikeshedding a change to the array syntax, why not axe the [X, ..N] syntax in expression contexts all together? I believe a macro that expands array![X, ..N] to just [X, X, X, ......] will work in all contexts that currently accept the current literal format, and also avoids some of the annoyances with the current implementation with respect to NoCopy types. It'd I think have to be a procedural macro, but that's fine if we bake it in.

@ftxqxd
Copy link
Contributor

ftxqxd commented Dec 13, 2014

@sfackler Expanding array![X, ..N] to array![X, X, X, X, ...] isn’t really practical if X is a function call or anything that could be expensive to calculate. To avoid that, it’d have to use a temporary, but constant expressions currently only support temporaries that are also consts, and consts require type annotations, so I can’t see a way of making that work. Also, the N could be an arbitrary constant expression (e.g., another const), and they can’t be resolved at expansion time. See my comment above for the closest I could get to a macro that works in constant expressions that also could be made to accept any constant repeat parameter.

@quantheory
Copy link
Contributor

My preference would be, in this order:

  • Use a keyword for the array case ([X for N] if it doesn't cause an ambiguity itself, or introducing a keyword if necessary; prepositions like by are not common variable or method names, so they tend to be better keywords, I think).
  • New uses of symbols for the array case (e.g. [X # N], which is OK, but distracts me slightly because I mentally associate # with comments or directives).
  • Change the range syntax to use some other symbol than ...
  • Make range syntax only work when indexing.
  • Removing the ..j slicing syntax.

I thought that the two uses of .. were odd even before I heard about this ambiguity, because IMO duplication and slicing/ranges are not semantically close enough to warrant using the same symbol. I think that the comments here also make a decent case for keeping RangeTo if possible.

At the same time, the only reason for keeping the current array repeat syntax seems to be that we already have something implemented, and there are so many acceptable alternatives that we won't be able to choose an alternative. (That's what bike-shedding means, after all!) It would be a shame if the intuitiveness (and to some extent functionality) of slicing syntax was sacrificed in order to preserve an array repeat syntax that is not bad, but still basically arbitrary and not set in stone pre-1.0.

@glaebhoerl
Copy link
Contributor

FWIW, I think @mdinger's idea [5 of 'a']: [5 of char] reads very nicely, and is even nicer than the current syntax, unlike the others so far.

@nrc
Copy link
Member

nrc commented Dec 13, 2014

Of the fixed length array syntaxes, I prefer [expr for constexpr] the most - it does not add another keyword and bares some likeness to other languages, in particular Python. I would prefer this to be a macro, but I agree that doesn't seem possible. However, I still don't mind the asymmetry too much. I believe that it will be uncommon to have any range where the lower limit is not obvious.

Given the time frame, I think we need to move quickly on any syntax change. There seems to be broad agreement here at least that changing the syntax of repeating arrays is better than losing symmetry in ranges. Could somebody write up a short, strawman RFC for such a change (plus changes) so we can see if there is resistance to the general idea and/or if a wider audience has opinions/ideas on the exact syntax.

I can implement whichever change we choose (I am currently working on the range/slicing changes), but we need to reach a decision very quickly if the implementation work is going to be done by Jan 9th.

@quantheory
Copy link
Contributor

I like the version using "of" and am writing up an RFC for it. I have changed actually changed my mind a bit about using "for" even though it avoids a new keyword; I'll explain that in the RFC.

@aturon
Copy link
Member Author

aturon commented Dec 18, 2014

I'm going to close this RFC in favor of changing the fixed-size array syntax. Thanks for the feedback!

@aturon aturon closed this Dec 18, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.