Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tokenize 0.5.19 broke JuMPs formatting #189

Closed
odow opened this issue Aug 24, 2021 · 5 comments · Fixed by #191
Closed

Tokenize 0.5.19 broke JuMPs formatting #189

odow opened this issue Aug 24, 2021 · 5 comments · Fixed by #191

Comments

@odow
Copy link

odow commented Aug 24, 2021

jump-dev/JuMP.jl#2673

A passing log is (Tokenize 0.5.18):

https://github.com/jump-dev/JuMP.jl/runs/3391118073

A failing log is (Tokenize 0.5.19):

https://github.com/jump-dev/JuMP.jl/runs/3416358585

Confirmed that the only difference between the two runs is the minor bump of tokenize, and verified this locally. It's also weird in that I can only reproduce on linux. Mac works fine.

(@v1.6) pkg> st
      Status `~/.julia/environments/v1.6/Project.toml`
  [12aac903] BinaryBuilder v0.4.0
  [98e50ef6] JuliaFormatter v0.13.2
  [0796e94c] Tokenize v0.5.18

julia> using JuliaFormatter; format("src/lp_sensitivity.jl")
true

(@v1.6) pkg> add Tokenize@0.5.19
    Updating registry at `~/.julia/registries/General`
    Updating git-repo `https://github.com/JuliaRegistries/General.git`
   Resolving package versions...
    Updating `~/.julia/environments/v1.6/Project.toml`
  [0796e94c]  Tokenize v0.5.18  v0.5.19
    Updating `~/.julia/environments/v1.6/Manifest.toml`
  [0796e94c]  Tokenize v0.5.18  v0.5.19
  Progress [========================================>]  3/3
  ? CSTParser
  ? JuliaFormatter
1 dependency successfully precompiled in 4 seconds (64 already precompiled)
2 dependencies failed but may be precompilable after restarting julia

julia> exit()
(base) parallels@parallels-Parallels-Virtual-Platform:~/.julia/dev/JuMP$ ~/julia-1.6
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.6.0 (2021-03-24)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

(@v1.6) pkg> st
      Status `~/.julia/environments/v1.6/Project.toml`
  [12aac903] BinaryBuilder v0.4.0
  [98e50ef6] JuliaFormatter v0.13.2
  [0796e94c] Tokenize v0.5.19

julia> using JuliaFormatter; format("src/lp_sensitivity.jl")
[ Info: Precompiling JuliaFormatter [98e50ef6-434e-11e9-1051-2b60c6c9e899]
ERROR: Parsing error for input occurred on line 419, offset: 79
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:33
  [2] format_text(text::String, style::DefaultStyle, opts::JuliaFormatter.Options)
    @ JuliaFormatter ~/.julia/packages/JuliaFormatter/FlpTt/src/JuliaFormatter.jl:322
  [3] #format_text#161
    @ ~/.julia/packages/JuliaFormatter/FlpTt/src/JuliaFormatter.jl:316 [inlined]
  [4] #format_text#160
    @ ~/.julia/packages/JuliaFormatter/FlpTt/src/JuliaFormatter.jl:310 [inlined]
  [5] format_file(filename::String; overwrite::Bool, verbose::Bool, format_markdown::Bool, format_options::Base.Iterators.Pairs{Symbol, Integer, NTuple{5, Symbol}, NamedTuple{(:short_to_long_function_def, :remove_extra_newlines, :margin, :always_for_in, :always_use_return), Tuple{Bool, Bool, Int64, Bool, Bool}}})
    @ JuliaFormatter ~/.julia/packages/JuliaFormatter/FlpTt/src/JuliaFormatter.jl:421
  [6] format(paths::Tuple{String}; options::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ JuliaFormatter ~/.julia/packages/JuliaFormatter/FlpTt/src/JuliaFormatter.jl:528
  [7] format
    @ ~/.julia/packages/JuliaFormatter/FlpTt/src/JuliaFormatter.jl:504 [inlined]
  [8] #format#170
    @ ~/.julia/packages/JuliaFormatter/FlpTt/src/JuliaFormatter.jl:553 [inlined]
  [9] format(path::String)
    @ JuliaFormatter ~/.julia/packages/JuliaFormatter/FlpTt/src/JuliaFormatter.jl:553
 [10] top-level scope
    @ REPL[2]:1

I'll try to narrow it down further

@odow
Copy link
Author

odow commented Aug 24, 2021

I think this might be my new favorite bug of all time: \ ej

(@v1.6) pkg> st
      Status `~/.julia/environments/v1.6/Project.toml`
  [12aac903] BinaryBuilder v0.4.1
  [98e50ef6] JuliaFormatter v0.15.8
  [0796e94c] Tokenize v0.5.19

julia> using JuliaFormatter
[ Info: Precompiling JuliaFormatter [98e50ef6-434e-11e9-1051-2b60c6c9e899]

julia> format_text(raw"A \ ei")
"A \\ ei"

julia> format_text(raw"A \ ej")
ERROR: Parsing error for input occurred on line 1, offset: 3
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:33
 [2] format_text(text::String, style::DefaultStyle, opts::JuliaFormatter.Options)
   @ JuliaFormatter ~/.julia/packages/JuliaFormatter/8V7tA/src/JuliaFormatter.jl:365
 [3] #format_text#185
   @ ~/.julia/packages/JuliaFormatter/8V7tA/src/JuliaFormatter.jl:359 [inlined]
 [4] format_text
   @ ~/.julia/packages/JuliaFormatter/8V7tA/src/JuliaFormatter.jl:357 [inlined]
 [5] #format_text#184
   @ ~/.julia/packages/JuliaFormatter/8V7tA/src/JuliaFormatter.jl:353 [inlined]
 [6] format_text(text::String)
   @ JuliaFormatter ~/.julia/packages/JuliaFormatter/8V7tA/src/JuliaFormatter.jl:353
 [7] top-level scope
   @ REPL[4]:1

julia> format_text(raw"A \ ek")
"A \\ ek"

(@v1.6) pkg> add Tokenize@0.5.18
    Updating registry at `~/.julia/registries/General`
    Updating git-repo `https://github.com/JuliaRegistries/General.git`
   Resolving package versions...
    Updating `~/.julia/environments/v1.6/Project.toml`
  [0796e94c]  Tokenize v0.5.19  v0.5.18
    Updating `~/.julia/environments/v1.6/Manifest.toml`
  [0796e94c]  Tokenize v0.5.19  v0.5.18
  Progress [========================================>]  3/3
  ? CSTParser
  ? JuliaFormatter
1 dependency successfully precompiled in 4 seconds (61 already precompiled)
2 dependencies failed but may be precompilable after restarting julia

julia> exit()
(base) parallels@parallels-Parallels-Virtual-Platform:~/.julia/dev/JuMP$ ~/julia-1.6
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.6.0 (2021-03-24)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using JuliaFormatter
[ Info: Precompiling JuliaFormatter [98e50ef6-434e-11e9-1051-2b60c6c9e899]

julia> format_text(raw"A \ ei")
"A \\ ei"

julia> format_text(raw"A \ ej")
"A \\ ej"

julia> format_text(raw"A \ ek")
"A \\ ek"

But no seriously, \(arg, ej) fails to pass, no matter how you word it

julia> format_text(raw"\(a, ek)")
"\\(a, ek)"

julia> format_text(raw"\(a, ei)")
"\\(a, ei)"

julia> format_text(raw"\(a, ej)")
ERROR: Parsing error for input occurred on line 1, offset: 7
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:33
 [2] format_text(text::String, style::DefaultStyle, opts::JuliaFormatter.Options)
   @ JuliaFormatter ~/.julia/packages/JuliaFormatter/8V7tA/src/JuliaFormatter.jl:365
 [3] #format_text#185
   @ ~/.julia/packages/JuliaFormatter/8V7tA/src/JuliaFormatter.jl:359 [inlined]
 [4] format_text
   @ ~/.julia/packages/JuliaFormatter/8V7tA/src/JuliaFormatter.jl:357 [inlined]
 [5] #format_text#184
   @ ~/.julia/packages/JuliaFormatter/8V7tA/src/JuliaFormatter.jl:353 [inlined]
 [6] format_text(text::String)
   @ JuliaFormatter ~/.julia/packages/JuliaFormatter/8V7tA/src/JuliaFormatter.jl:353
 [7] top-level scope
   @ REPL[20]:1

julia> format_text(raw"\(a, ek)")
"\\(a, ek)"

@odow
Copy link
Author

odow commented Aug 24, 2021

Well dang...

julia> collect(tokenize(raw"a \ ei"))
6-element Vector{Tokenize.Tokens.Token}:
 1,1-1,1          IDENTIFIER     "a"
 1,2-1,2          WHITESPACE     " "
 1,3-1,3          OP             "\\"
 1,4-1,4          WHITESPACE     " "
 1,5-1,6          IDENTIFIER     "ei"
 1,7-1,6          ENDMARKER      ""

julia> collect(tokenize(raw"a \ ej"))
6-element Vector{Tokenize.Tokens.Token}:
 1,1-1,1          IDENTIFIER     "a"
 1,2-1,2          WHITESPACE     " "
 1,3-1,3          OP             "\\"
 1,4-1,4          WHITESPACE     " "
 1,5-1,6          KEYWORD        "if"
 1,7-1,6          ENDMARKER      ""

@odow
Copy link
Author

odow commented Aug 24, 2021

@ZacLN it's a hash collision with the change you introduced here #188

julia> Tokenize.Lexers.simple_hash("ej")
21626

julia> Tokenize.Lexers.simple_hash("if")
21626

@odow
Copy link
Author

odow commented Aug 25, 2021

Here's all the collisions of length <=4 matching (a-zA-Z)+. It seems we need a proper hash, or a check if the identifier matches the corresponding keyword.

ne => DO
ej => IF
mj => IN
rja => ISA
mfd => END
ngr => FOR
dmt => LET
quy => TRY
xupe => TYPE
neqe => ELSE
doqe => ELSE
kfse => ELSE
quue => TRUE

@KristofferC
Copy link
Sponsor Member

Thanks for the issue, I will make a new version with that reverted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants