Crash on f-string with `\{` #4421

JelleZijlstra · 2024-07-31T17:26:14Z

Black fails to format a valid Python f-string containing \{. I minimized this from a file in Pygments, pygments/lexers/int_fiction.py.

% cat rf.py 
rf'{a}\{{[\}}'
% black --check rf.py
error: cannot format rf.py: Cannot parse: 1:10: rf'{a}\{{[\}}'

Oh no! 💥 💔 💥
1 file would fail to reformat.
% python rf.py 
Traceback (most recent call last):
  File "/Users/jelle/py/black/rf.py", line 1, in <module>
    rf'{a}\{{[\}}'
        ^
NameError: name 'a' is not defined
% python -V
Python 3.12.4
% black --version
black, 24.4.3.dev27+g7fa1faf (compiled: no)
Python (CPython) 3.12.4

cc @tusharsadhwani for f-strings.

tusharsadhwani · 2024-07-31T17:28:36Z

>>> print(rf'{1}\{{[\}}')
1\{[\}

A small tweak to the example, so that it shows the expected output.

tusharsadhwani · 2024-07-31T17:37:09Z

Minimal repro: f'{1}\{{'

JelleZijlstra · 2024-07-31T17:46:50Z

Here's another example that doesn't yield a Black crash but that we parse incorrectly:

>>> black.parsing.lib2to3_parse(r"rf'\{{ {a}'")
Node(file_input, [Node(simple_stmt, [Node(fstring, [Leaf(FSTRING_START, "rf'"), Leaf(FSTRING_MIDDLE, '\\{{ {a}'), Leaf(FSTRING_END, "'")]), Leaf(NEWLINE, '\n')]), Leaf(ENDMARKER, '')])

The {a} part gets put inside an FSTRING_MIDDLE leaf, when it should be an fstring_replacement_field node. Compare the tree if you remove the backslash:

>>> black.parsing.lib2to3_parse(r"rf'{{ {a}'")
Node(file_input, [Node(simple_stmt, [Node(fstring, [Leaf(FSTRING_START, "rf'"), Leaf(FSTRING_MIDDLE, '{{ '), Node(fstring_replacement_field, [Leaf(LBRACE, '{'), Leaf(NAME, 'a'), Leaf(RBRACE, '}')]), Leaf(FSTRING_MIDDLE, ''), Leaf(FSTRING_END, "'")]), Leaf(NEWLINE, '\n')]), Leaf(ENDMARKER, '')])
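
To make the expected split concrete, here is a toy scanner that separates the body of a raw f-string into literal text and replacement fields. It is only a sketch and is not Black's or CPython's tokenizer: the name `split_raw_fstring_body` is hypothetical, and it assumes a raw string whose fields contain no quotes. The point is that in a raw f-string the backslash is an ordinary character, so it must not change how the following `{` or `{{` is read.

```python
def split_raw_fstring_body(body: str) -> list[tuple[str, str]]:
    """Split a raw f-string body into ("literal", text) and ("field", text) parts.

    Hypothetical helper for illustration only. Assumes a raw f-string
    (backslash is a plain character) and fields without nested quotes.
    """
    parts: list[tuple[str, str]] = []
    literal: list[str] = []
    i, n = 0, len(body)
    while i < n:
        ch = body[i]
        if ch == "{":
            if i + 1 < n and body[i + 1] == "{":
                # "{{" is an escaped brace and stays in the literal text.
                literal.append("{{")
                i += 2
                continue
            # A single "{" opens a replacement field: flush pending literal text.
            if literal:
                parts.append(("literal", "".join(literal)))
                literal = []
            depth, j = 1, i + 1
            while j < n and depth:
                if body[j] == "{":
                    depth += 1
                elif body[j] == "}":
                    depth -= 1
                j += 1
            if depth:
                raise ValueError("unterminated replacement field")
            parts.append(("field", body[i:j]))
            i = j
        elif ch == "}":
            if i + 1 < n and body[i + 1] == "}":
                # "}}" is an escaped closing brace.
                literal.append("}}")
                i += 2
            else:
                raise ValueError("single '}' outside a replacement field")
        else:
            # Everything else, including a backslash, is literal text.
            literal.append(ch)
            i += 1
    if literal:
        parts.append(("literal", "".join(literal)))
    return parts


print(split_raw_fstring_body(r"\{{ {a}"))  # [('literal', '\\{{ '), ('field', '{a}')]
print(split_raw_fstring_body(r"\{1}"))     # [('literal', '\\'), ('field', '{1}')]
```

In both calls the field is reported separately from the literal text, which is the split the parse tree should reflect.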

JelleZijlstra · 2024-07-31T17:50:05Z

And another one:

>>> black.parsing.lib2to3_parse(r"rf'\{1}'")
Node(file_input, [Node(simple_stmt, [Node(fstring, [Leaf(FSTRING_START, "rf'"), Leaf(FSTRING_MIDDLE, '\\{1}'), Leaf(FSTRING_END, "'")]), Leaf(NEWLINE, '\n')]), Leaf(ENDMARKER, '')])
>>> rf'\{1}'
'\\1'

tusharsadhwani · 2024-07-31T17:50:21Z

Normal case:

$ cat foo.py
f'foo \{{'

$ python -m blib2to3.pgen2.tokenize foo.py
1,0-1,2:        FSTRING_START   "f'"
1,2-1,9:        FSTRING_MIDDLE  'foo \\{{'
1,9-1,10:       FSTRING_END     "'"
1,10-1,11:      NEWLINE '\n'
2,0-2,0:        ENDMARKER       ''

Abnormal case:

$ cat foo.py
f'{1} \{{'

$ python -m blib2to3.pgen2.tokenize foo.py
1,0-1,2:        FSTRING_START   "f'"
1,2-1,2:        FSTRING_MIDDLE  ''
1,2-1,3:        LBRACE  '{'
1,3-1,4:        NUMBER  '1'
1,4-1,5:        RBRACE  '}'
1,5-1,8:        FSTRING_MIDDLE  ' \\{'
1,8-1,9:        LBRACE  '{'
1,9-1,10:       ERRORTOKEN      "'"
1,10-1,11:      NL      '\n'
Traceback (most recent call last):
[...]

The tokenizer stops handling \{{ correctly, but only after it has already parsed at least one braced expression. This also explains the deviation in parsing.

tusharsadhwani · 2024-07-31T17:54:47Z

Yeah this is interesting too.

$ python -m tokenize <<<"rf'\{2+3}'"
1,0-1,3:            FSTRING_START  "rf'"          
1,3-1,4:            FSTRING_MIDDLE '\\'           
1,4-1,5:            OP             '{'            
1,5-1,6:            NUMBER         '2'            
1,6-1,7:            OP             '+'            
1,7-1,8:            NUMBER         '3'            
1,8-1,9:            OP             '}'            
1,9-1,10:           FSTRING_END    "'"            
1,10-1,11:          NEWLINE        '\n'           
2,0-2,0:            ENDMARKER      ''             

$ python -m blib2to3.pgen2.tokenize <<<"rf'\{2+3}'"
1,0-1,3:        FSTRING_START   "rf'"
1,3-1,9:        FSTRING_MIDDLE  '\\{2+3}'
1,9-1,10:       FSTRING_END     "'"
1,10-1,11:      NEWLINE '\n'
2,0-2,0:        ENDMARKER       ''
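
The reference token stream can also be collected programmatically with the standard library `tokenize` module, which is handy for comparing against blib2to3 in a test. This is a minimal sketch: the helper name `token_stream` is hypothetical, and the f-string token types only exist on Python 3.12 or newer (older interpreters report the whole f-string as a single STRING token).

```python
import io
import tokenize


def token_stream(source: str) -> list[tuple[str, str]]:
    """Return (token name, token text) pairs from the stdlib tokenizer.

    Hypothetical helper for illustration; not part of Black.
    """
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
    ]


# Same input as the shell examples above, with a trailing newline.
for name, text in token_stream("rf'\\{2+3}'\n"):
    print(f"{name:<15}{text!r}")
```

On Python 3.12+ the FSTRING_MIDDLE token should contain only the backslash, with the braces and the expression reported as separate tokens, as in the first listing above.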

tusharsadhwani · 2024-07-31T17:56:34Z

Since we don't format expressions inside f-strings right now, this wouldn't have been caught for a long time, but thanks to this bug we know of this regression :D

JelleZijlstra · 2024-07-31T18:05:55Z

You can also construct crashes from these bad parses, if the replacement field contains nested strings relying on PEP 701:

% black -c 'rf"\{"a"}"'
rf"\{"a"}"
error: cannot format <string>: Cannot parse: 1:6: rf"\{"a"}"
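
A quick way to confirm that such an input is valid for the interpreter at hand, before attributing the failure to Black, is to hand it to the built-in `compile()`. This is a sketch under assumptions: the helper name `is_valid_python` is hypothetical, and the nested-quote example is only expected to compile on Python 3.12 or newer, where PEP 701 applies.

```python
import sys


def is_valid_python(source: str) -> bool:
    """Return True if the running interpreter can compile `source`.

    Hypothetical helper for illustration; it only checks syntax and
    never executes the code.
    """
    try:
        compile(source, "<string>", "exec")
    except SyntaxError:
        return False
    return True


print(sys.version_info[:2])
# Reusing the outer quote inside the field relies on PEP 701 (Python 3.12+).
print(is_valid_python('rf"\\{"a"}"\n'))
# No nested quotes here, so this does not depend on PEP 701.
print(is_valid_python("rf'\\{1}'\n"))
```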

JelleZijlstra added the "T: bug" label Jul 31, 2024

tusharsadhwani mentioned this issue Jul 31, 2024

fix: respect braces better in f-string parsing #4422

Merged

JelleZijlstra closed this as completed in #4422 Aug 2, 2024
