Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add grammar for char, string, byte, and raw literals #121

Merged
merged 3 commits into from
Sep 26, 2017

Conversation

brauliobz
Copy link
Contributor

Part of #84.

@ehuss
Copy link
Contributor

ehuss commented Sep 25, 2017

Now that underscores are accepted in unicode escapes in Rust 1.22, should the UNICODE_ESCAPE production include underscores? Maybe something like this?

UNICODE_ESCAPE :
    \u{ HEX_DIGIT _* }
  | \u{ HEX_DIGIT _* HEX_DIGIT _* }
  | \u{ HEX_DIGIT _* HEX_DIGIT _* HEX_DIGIT _* }
  | \u{ HEX_DIGIT _* HEX_DIGIT _* HEX_DIGIT _* HEX_DIGIT _* }
  | \u{ HEX_DIGIT _* HEX_DIGIT _* HEX_DIGIT _* HEX_DIGIT _* HEX_DIGIT _* }
  | \u{ HEX_DIGIT _* HEX_DIGIT _* HEX_DIGIT _* HEX_DIGIT _* HEX_DIGIT _* HEX_DIGIT _* }

@brauliobz
Copy link
Contributor Author

should the UNICODE_ESCAPE production include underscores? Maybe something like this?

Cool! I wasn't aware of that :). Yes, that would be the way I would include it in the grammar.

So... I am against including this right now. The problem in including this change is that I'm testing the grammar on stable to avoid having such a fast moving target and to avoid including in it things that maybe will took too long or never reach stable at all (since in my understanding lots of things on nightly are experimental).

Also, correct me if I am wrong: the Rust 1.22 you talk about is actually nightly, right? It doesn't mean this will be released on Rust 1.22, right? The underscore in Unicode escapes isn't available even in beta yet (Rust 1.21.beta3).

@ehuss
Copy link
Contributor

ehuss commented Sep 25, 2017

Rust 1.22 you talk about is actually nightly, right?

Nightly, yes.

BTW, do you want some help on this? I'd be willing to help if you could point me to things to add/fix. I also might have some questions about the grammar you have so far, and am curious how you're testing it.

@Havvy
Copy link
Contributor

Havvy commented Sep 26, 2017

The underscores in Unicode escapes is insta-stable (there's no feature gate) and the reference rides the release trains, so it should be included in this grammar. It can be done in a separate PR if that would help.

Copy link
Contributor

@Havvy Havvy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, minus the underscores from other comments. I can merge this now if you don't want to add those into this commit, otherwise I can re-approve and merge after those are added.

@brauliobz
Copy link
Contributor Author

@Havvy

LGTM, minus the underscores from other comments

I've added them. Shouldn't they already be on beta, then? Or is it too soon?

@ehuss

BTW, do you want some help on this? I'd be willing to help if you could point me to things to add/fix.

Yeah, that would be great. But I have too many things half done... I will need to take a look at what I have and commit it.

I also might have some questions about the grammar you have so far

I'm on IRC sometimes, on #rust-docs .

and am curious how you're testing it.

I'm reading the rustc's parser/lexer, making an Antlr4 grammar out of it, and making some test cases and testing them against my grammar and rustc.

@brauliobz
Copy link
Contributor Author

Now it is a little too long. I didn't want to use {n, m} notation but maybe

\u{ ( HEX_DIGIT _* ){1,6} }

would be a better way to do this? Or even

\u{ ( HEX_DIGIT _* )1..6 }

@Havvy
Copy link
Contributor

Havvy commented Sep 26, 2017

Yeah. It's literally one character too long when I render it, and spilling onto the next line in a confusing way. 😕 I don't care whether it's {n,m} or n..m, or even 1 to 6 repetitions just as long as it conveys what is going on. It might be useful to describe the notation used by the grammar in lexical-structure.html including any interesting things you use.

@Havvy Havvy merged commit ccd7dad into rust-lang:master Sep 26, 2017
@Havvy
Copy link
Contributor

Havvy commented Sep 26, 2017

💟 Thanks.

@brauliobz brauliobz deleted the grammar_char_string_literals branch October 2, 2017 12:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants