Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Syntax for non-standard identifiers #32408

Merged
merged 5 commits into from
Aug 16, 2019
Merged

Conversation

c42f
Copy link
Member

@c42f c42f commented Jun 25, 2019

Introduce syntax for non-standard identifiers. Features:

Various syntax choices were explored (see below), with var"ident" currently chosen. Though this has the familiar visual form of a string macro, special parser support was required to make it valid in every context where a normal identifier is allowed. Therefore, this is technically a breaking change. However, the use here has exactly the same semantics as the use in RCall which is the only package I could find on github which defines @var_str.

show has been changed to use this syntax as necessary when printing expressions:

  • Fixes Printing of macro symbols displays like a macrocall #32354 by printing macro identifiers using var"@ident" rather than as a macro call @ident.
  • Fixes printing of symbols generated by macro hygiene, for example var"#89#err" rather than the bare #89#err
  • Simplify printing of functions and type names which are generated by the compiler.

@c42f c42f added the bugfix This change fixes an existing bug label Jun 25, 2019
@c42f
Copy link
Member Author

c42f commented Jun 25, 2019

Ugh, tests are failing due to a somewhat dubious Symbol hack in Test.

@JeffBezanson
Copy link
Sponsor Member

Maybe it's time to add:

macro sym_str(str)
    Symbol(str)
end

🎉
I don't really see any downside to that, except that people will also want to write quoted symbols, which you can do with the subtly-different syntax Symbol("..."). See also #9945. So the only point of this would be nicer printing of Exprs containing non-identifier symbols.

@c42f
Copy link
Member Author

c42f commented Jun 27, 2019

🎉 ... wait, what :-)

That's a neat idea I think we just need a better name for it: when used in a block of code, it doesn't produce a (quoted) Symbol, but rather an identifier [edit: yes, an unquoted symbol is an identifier. But I was really expecting the quoted version there]. So we could have ident"##", name"##" var"##" or some such thing? Ugly is kind of a feature here, but the current version of $(Symbol("##")) is a bit too ugly.

@JeffBezanson
Copy link
Sponsor Member

Agreed; var" " seems nice. Hopefully doesn't conflict with anything in a package.

@c42f
Copy link
Member Author

c42f commented Jul 3, 2019

I tried searching github for this and it turns out the RCall defines @var_str... to be exactly the macro we want: https://github.com/JuliaInterop/RCall.jl/blob/271d9e2ace285417666b731cda3f7b33d60f6aea/src/macros.jl#L79

It seems @simonbyrne added this in JuliaInterop/RCall.jl#71 (comment)

@c42f c42f force-pushed the cjf/fix-show-symbols-in-exprs branch from 8862925 to 8be46c4 Compare July 3, 2019 08:05
@c42f
Copy link
Member Author

c42f commented Jul 3, 2019

@var_str is added. Still need to clean up the old Test code which is abusing Symbol to store strings.

@c42f
Copy link
Member Author

c42f commented Jul 3, 2019

I think the look of @var_str in macro expansions is quite satisfying -

julia> @macroexpand @info "msg" x=1 y=2
quote
    #= logging.jl:305 =#
    var"#25#level" = Base.CoreLogging.Info
    #= logging.jl:306 =#
    var"#26#std_level" = Base.CoreLogging.convert(Base.CoreLogging.LogLevel, var"#25#level")
    #= logging.jl:307 =#
    if var"#26#std_level" >= Base.CoreLogging.getindex(Base.CoreLogging._min_enabled_level)
        #= logging.jl:308 =#
        var"#27#group" = $(QuoteNode("REPL[23]"))
        #= logging.jl:309 =#
        var"#28#_module" = Main
        #= logging.jl:310 =#
        var"#29#logger" = Base.CoreLogging.current_logger_for_env(var"#26#std_level", var"#27#group", var"#28#_module")
        #= logging.jl:311 =#
        if !(var"#29#logger" === Base.CoreLogging.nothing)
            #= logging.jl:312 =#
... etc etc

@KristofferC
Copy link
Sponsor Member

KristofferC commented Jul 3, 2019

Not sure if this matters but e.g.

try catch var"#89#err" end

is currently a syntax error and this gets printed from macro expansions.

@c42f
Copy link
Member Author

c42f commented Jul 3, 2019

@KristofferC oh good catch. This particular case is easily fixed in the parser; not ideal but I think acceptable given that $ interpolation is already special cased in the same way.

@simonbyrne
Copy link
Contributor

I seem to recall that there are some edge cases where it acts a little unexpectedly. The only one I can think of at the moment is when it is used after a ., e.g.

julia> macro var_str(str)
           esc(Symbol(str))
       end
@var_str (macro with 1 method)

julia> x = 1+2im
1 + 2im

julia> x.im
2

julia> x.var"im"
ERROR: LoadError: type Complex has no field @var_str
Stacktrace:
 [1] getproperty(::Any, ::Symbol) at ./sysimg.jl:18
in expression starting at REPL[4]:1

@KristofferC
Copy link
Sponsor Member

But wouldn't you have to write getproperty(x, var"im") anyway?

@JeffBezanson
Copy link
Sponsor Member

That would be getproperty(x, Symbol("im")).

True, without parser support var"" can't be a full drop-in replacement for identifiers.

@c42f
Copy link
Member Author

c42f commented Jul 3, 2019

I've added some parser support to fix the try catch var"exc" end case pointed out by @KristofferC but there's a few other problematic cases still. I'm not quite sure how much of a game of whack-a-mole that's going to turn out to be but the fact that $-interpolation works makes me hopeful.

And there I was, thinking it would be a guilty pleasure to tackle an easy "good first issue" like #32354 and get it sorted out within the hour :-)

@c42f c42f force-pushed the cjf/fix-show-symbols-in-exprs branch from 1fcfcc0 to 947efbc Compare July 6, 2019 21:15
@c42f
Copy link
Member Author

c42f commented Jul 6, 2019

Good example @simonbyrne. I think I've got most of the cases in the parser sorted out now.

For the x.var"im" case I had to expand the syntax in the parser as it expects the symbol im to be quoted (whereas var"im" expands to an unquoted name im). Supporting printing of the @var_str macros in unusual places like import also causes some surprises in show which I haven't finished sorting out yet.

Taken together, I think it might be better to declare that var"str" is a special syntax which directly interpolates $(Symbol("str")) at parse time. Given that we already need parser support that would seem to be simpler and more consistent.

@JeffBezanson
Copy link
Sponsor Member

Makes sense; we can just add one case to parse-atom instead of lots of special cases everywhere..

@JeffBezanson
Copy link
Sponsor Member

We could try to find a non-breaking syntax for this (i.e. something that's currently an error). One option is @"name".

@c42f
Copy link
Member Author

c42f commented Jul 7, 2019

we can just add one case to parse-atom instead of lots of special cases everywhere

Agreed, that will be a lot cleaner.

We could try to find a non-breaking syntax for this (i.e. something that's currently an error). One option is @"name".

Ok that's an interesting idea. I wouldn't want to steal useful syntax for this but @"name" is kind of odd looking enough that it's not obvious that it should mean something.

@StefanKarpinski
Copy link
Sponsor Member

StefanKarpinski commented Jul 7, 2019

$”string” is valid syntax but pretty useless. I had proposed :”#funky name” as syntax for Symbol(“#funky name”) which would mean that $:”#funky name” would be the syntax for the variable. Perhaps too odd looking? The syntax for funky symbols would be slightly breaking since :”foo” is currently a convoluted way to write ”foo”.

Perhaps something that is really non-breaking like Jeff’s @“#funky name” would be better after all—it more directly solves the problem here which is writing strange variables, not strange symbols. Would that mean that :@“foo” would become an alternate way to write :foo?

@c42f
Copy link
Member Author

c42f commented Jul 8, 2019

Would that mean that :@“foo” would become an alternate way to write :foo?

Yes that's a natural consequence of the way the parser works. Good point.

So comparing two alternatives for unquoted / quoted symbols:

  • @"funky" / :@"funky" (variable name / quote a variable name)
  • $:"funky" / :"funky" (unquote a quoted variable name / quote a variable name)

The second is a bit double-negative-y, not backward compatible and less natural in the parser. On the plus side it gives a neat syntax for quoted symbol literals and an uglier syntax for funky variable names.

@StefanKarpinski
Copy link
Sponsor Member

StefanKarpinski commented Jul 8, 2019

Additional considerations which are kind of subjective:

  • :"funky" looks suggestive (to me) of what it is proposed to mean—a way of writing a symbol with strange content. This is also the syntax for unusual symbols in Ruby and since the :foo syntax was Ruby-inspired in the first place, it seems fitting to also have :"foo" mean the same thing. Note that Ruby allows interpolation in such a case, which is also handy.

  • $:"funky" looks suggestive of splicing something into the code, which is what is going on here. If I see quote $:foo = 1 end then I know that produces quote foo = 1 end and quote $:"funky" = 1 end looks suggestive of something similar. The fact that :"funky" looks like it might mean :funky means that I can guess what's going on here.

  • On the other hand, @"funky" looks like a macro call and :@"funky" looks like a quoted macro call.

Regarding breakingness of :"funky" for :funky, I really doubt that anyone in the wild is using :"funky" as a syntax for writing a string literal. So while this is technically breaking, I think it's very technical. Splicing a string value immediately after a : is already a syntax error:

julia> x = "funky"; :$x
ERROR: syntax: extra token "x" after end of expression

julia> x = "funky"; :$(x)
ERROR: MethodError: objects of type Symbol are not callable

So we don't really even have a referential transparency type of situation here—it would only break the literal use of :"foo" to write an immediate string literal.

@JeffBezanson
Copy link
Sponsor Member

I share the concern that @"a" looks like it has something to do with calling a macro. So I'd default to picking var"a". Colons and dollar signs can already get quite confusing. In particular, the most common use of interpolation looks like $identifier. So if a $ is required to write the identifier itself, you need an extra layer of @eval, which gets very annoying to deal with.

@c42f
Copy link
Member Author

c42f commented Jul 8, 2019

I tried out @"foo", it's a bit visually confusing.

var"str" has the advantage that it's visually familiar to anyone who already knows about string macros. The fact that it starts with an identifier makes it look more natural in various syntactic contexts. Some examples:

# R identifiers
foo(x, y, var"draw.segments" = 1)
foo(x, y, @"draw.segments" = 1)
foo(x, y, $:"draw.segments" = 1) 

# Constructing weirdly named types
struct var"##"
    var"hey-"
end
x = var"##"(1)
x.var"hey-"

struct @"##"
    @"hey-"
end
x = @"##"(1)
x.@"hey-"

struct $:"##"
    $:"hey-"
end
x = $:"##"(1)
x.$:"hey-"

We don't get a very natural way to write quoted symbol literals with it; x = :var"foo" looks odd. We could just have sym"foo" or :"foo" separately.

So if a $ is required to write the identifier itself, you need an extra layer of @eval, which gets very annoying to deal with.

Would it be a problem for the parser to eagerly interpolate $:"foo"? That doesn't seem different from declaring that var"foo" is special syntax and expanding it in the parser.

@c42f c42f changed the title Fix show of non-identifier Symbols in Exprs Syntax for unparsable identifiers and quoted symbol literals Jul 9, 2019
@c42f
Copy link
Member Author

c42f commented Jul 9, 2019

Well, I continue to make a mess here, but I'll rebase all that rubbish away once the design converges. As of now, the parser changes include two modifications to parse-atom:

  • Parse var"foo" as the identifier name foo
  • Implement Stefan's proposed symbol literal syntax :"foo" to produce the quoted symbol :foo. I also translated :"foo$x" to Core.Symbol("foo$x") as I expect people would be surprised if this doesn't work. (Using Core feels icky... perhaps using Expr(:symbol, "foo", x) in analogy to Expr(:string, "foo", x) would be better?)

Subjectively I think these are the least surprising syntaxes which have been proposed so far in this thread.

@JeffBezanson
Copy link
Sponsor Member

I'm not a big fan of :"a", even though it is visually nice and concise, since it's not consistent with colon as a quoting operator, and since Symbol("a") can already be used for that. We can tackle these separately.

@marius311
Copy link
Contributor

I think this commit broke the following:

julia> var="foo"; run(`echo "$var"`)
ERROR: LoadError: unterminated double quote
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] #shell_parse#350(::String, ::typeof(Base.shell_parse), ::String, ::Bool) at ./shell.jl:106
 [3] #shell_parse at ./none:0 [inlined] (repeats 2 times)
 [4] @cmd(::LineNumberNode, ::Module, ::Any) at ./cmd.jl:387
in expression starting at REPL[46]:1

Any other variable name besides var works.

In any case surely no important package has the bad luck of triggering such an obscure bug :)

marius311 added a commit to marius311/julia that referenced this pull request Aug 17, 2019
@c42f
Copy link
Member Author

c42f commented Aug 17, 2019

Good catch, though I'm not sure why you refer to PyCall. That particular use of $var is terminated by a closing parenthesis and should not be affected by this bug? [Edit: it turns out that Meta.parse() in positional mode is particularly sensitive to underlying stream position.]

In any case we do have a problem:

julia> Meta.parse("\"\$var\"")
:($(Expr(:incomplete, "incomplete: invalid string syntax")))

Looking at the parser code I think this particular case can be fixed (thankfully we don't support arbitrary atoms in string interpolation syntax). Your example hits the shell_parse code path which is slightly different as it calls Meta.parse().

@c42f
Copy link
Member Author

c42f commented Aug 17, 2019

Ah, the error you're seeing in PyCall is a few lines down (https://github.com/JuliaPy/PyCall.jl/blob/6dcf5c2e1ac1399f0ad4f7d4c7cdadfd21faa524/deps/buildutils.jl#L8); it's unrelated to shell_parse and basically the same root cause as my Meta.parse example:

julia> dump(Meta.parse("\"'\$var'\""))
Expr
  head: Symbol '
  args: Array{Any}((1,))
    1: Expr
      head: Symbol string
      args: Array{Any}((2,))
        1: String "'"
        2: Symbol var

@c42f
Copy link
Member Author

c42f commented Aug 18, 2019

So there's three problems here:

  • var syntax shouldn't be allowed in string interpolations (only identifiers are allowed). This seems easy to fix by not calling parse-atom while parsing string interpolations.

  • var syntax shouldn't be allowed in command string interpolations. This is more awkward to fix in isolation because Base.shell_parse already accepts some pretty weird things due to its use of Meta.parse in non-greedy mode, which ultimately calls into parse-atom. This behavior is inconsistent with string interpolation (see parse command interpolations at parse time #3150).

  • The use of Meta.parse non-greedy mode combined with peek-token is inconsistent and returns the wrong position for the end of the parsed atom. This would cause Base.shell_parse("\"(\$var)\"") to miss the trailing ) character. Arguably this is an existing bug which interacts in a bad way with the changes here. For example, in julia 1.x

    julia> Meta.parse(":,", 1, greedy=false)
    (:(:), 3)  # should return (:(:), 2)
    

@c42f
Copy link
Member Author

c42f commented Aug 18, 2019

Ok, I believe I've got a fix for the PyCall problems at #32948.

@marius311
Copy link
Contributor

Awesome, @c42f, thanks for taking a quick look sorry to throw you off a bit with my alternate example!

@StefanKarpinski
Copy link
Sponsor Member

This is proving a bit more problematic than expected. Perhaps we got a little overeager and revert?

@c42f
Copy link
Member Author

c42f commented Aug 19, 2019

Thanks @marius311 for the quick reporting. It turned out that your example was also an important bug. I was confused because cmd interpolations are parsed quite differently from string interpolations.

@StefanKarpinski I think all the problems we know about should now be fixed in #32948 (awaiting CI) so I'd like to see how that pans out. This did have a lot more fallout than anticipated 😬

@StefanKarpinski
Copy link
Sponsor Member

Yes, if that's the last of it then we can proceed.

@chriscoey
Copy link

DynamicPolynomials.jl fails to precompile on Julia master - perhaps due to this PR? Please see JuliaAlgebra/DynamicPolynomials.jl#50 (comment)

findmyway referenced this pull request in JuliaReinforcementLearning/ReinforcementLearningEnvironments.jl Aug 28, 2019
findmyway referenced this pull request in JuliaReinforcementLearning/ReinforcementLearningEnvironments.jl Aug 28, 2019
* unify interfaces

* update screen after interact

* add minor comment

* add version check

* update docker

* revert to juali v1.2 due to https://github.com/JuliaLang/julia/pull/32408\#issuecomment-522168938

* update README
findmyway referenced this pull request in JuliaReinforcementLearning/ReinforcementLearning.jl May 3, 2021
* unify interfaces

* update screen after interact

* add minor comment

* add version check

* update docker

* revert to juali v1.2 due to https://github.com/JuliaLang/julia/pull/32408\#issuecomment-522168938

* update README
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix This change fixes an existing bug display and printing Aesthetics and correctness of printed representations of objects. minor change Marginal behavior change acceptable for a minor release parser Language parsing and surface syntax
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Printing of macro symbols displays like a macrocall
7 participants