Lex Jupyter line magic with `Mode::Jupyter` #23

dhruvmanila · 2023-07-05T17:33:05Z

Lex Jupyter line magic with Mode::Jupyter

This PR adds a new token, `MagicCommand`¹, which the lexer recognizes when in `Mode::Jupyter`. The lexer rules are as follows:

- At the start of a line, skip the indentation and look for the characters that begin a magic command. Determine the magic kind and capture all the characters that follow as the command string.
- If the command extends over multiple lines, the lexer skips the line continuation character (`\`), but only when it is followed by a newline (`\n` or `\r`). A backslash anywhere else is part of the command string and must not be skipped:
```
//        Skip this backslash
//        v
//   !pwd \
//      && ls -a | sed 's/^/\\    /'
//                          ^^
//                          Don't skip these backslashes
```
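The two rules above can be sketched in a few lines of Python. This is only an illustration, not the Rust implementation from this PR: the `MAGIC_PREFIXES` table, the kind names other than `Magic`, and the function `lex_magic_line` are hypothetical, and the sketch assumes that the newline following a skipped backslash is dropped as well.

```python
# Hypothetical prefix table; the real MagicKind may cover more forms.
MAGIC_PREFIXES = {"%": "Magic", "!": "Shell", "?": "Help"}


def lex_magic_line(source, pos=0):
    """Try to lex a magic command, assuming `pos` is at the start of a line.

    Returns (kind, command, end_pos), or None if the line is not a magic.
    """
    n = len(source)
    # Rule 1: skip the indentation, then look for a magic prefix.
    while pos < n and source[pos] in " \t":
        pos += 1
    if pos >= n or source[pos] not in MAGIC_PREFIXES:
        return None
    kind = MAGIC_PREFIXES[source[pos]]
    pos += 1

    chars = []
    while pos < n:
        ch = source[pos]
        # Rule 2: skip a backslash only when a newline follows it.
        if ch == "\\" and pos + 1 < n and source[pos + 1] in "\r\n":
            pos += 1  # drop the backslash
            if source.startswith("\r\n", pos):
                pos += 2  # assumption: the newline is dropped too
            else:
                pos += 1
            continue
        if ch in "\r\n":
            break  # an unescaped newline ends the command
        chars.append(ch)  # any other backslash is kept as-is
        pos += 1
    return kind, "".join(chars), pos
```

Calling `lex_magic_line` on a shell command split over two lines yields a single command string in which the continuation backslash is gone while the backslashes inside the `sed` expression survive. A backslash that is the very last character of the input is kept as ordinary text, since no newline follows it.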
The parser, when in `Mode::Jupyter`, filters these tokens out before parsing begins. There is one caveat when the magic command is indented. In the following example, once the parser has filtered out the magic command, the `for` loop is left without a body, so the parser raises an indentation error:
```
for i in range(5):
	!ls

# What the parser will see
for i in range(5):
```
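The caveat can be reproduced with CPython's own parser. The sketch below is only an analogy: `drop_magic_lines` is a hypothetical, line-based stand-in for the token filtering described above, and `ast.parse` stands in for the parser in this repository.

```python
import ast


def drop_magic_lines(source):
    """Hypothetical stand-in for token filtering: drop lines starting with ! or %."""
    kept = [
        line
        for line in source.splitlines(keepends=True)
        if not line.lstrip().startswith(("!", "%"))
    ]
    return "".join(kept)


def parses(source):
    """Return True if CPython accepts `source` as a module."""
    try:
        ast.parse(source)
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False
    return True


cell = "for i in range(5):\n\t!ls\n"
filtered = drop_magic_lines(cell)  # only the `for` header remains
```

Here `parses(filtered)` is false because the loop header has no body left, whereas a top-level magic line can be dropped without affecting the surrounding statements.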

¹ I would prefer a different name, as this token represents not only a line magic (`%`) but also a shell command (`!`), a help command (`?`), and others. In the original implementation, it is named "IPython Syntax".

MichaReiser

Neat!

I'm a bit concerned about how we disambiguate magics from modulo operators that are split over multiple lines (something our formatter will create). Let's add some test cases for modulo operators to see if we handle them correctly.

The same applies to `/` and even to a comma:

# This is a tuple
(
	a
	,
)

# This is a division operation
(
	a
	/
	b
)
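For reference, CPython's tokenizer treats every line break inside brackets as a non-logical newline, so an operator at the start of such a line is never at the start of a statement. The sketch below uses the standard `tokenize` module to show this; it says nothing about how the lexer in this PR is implemented, and `newline_kinds` is a helper introduced only for illustration.

```python
import io
import tokenize


def newline_kinds(source):
    """Return the names of the newline-like tokens CPython emits for `source`."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [
        tokenize.tok_name[tok.type]
        for tok in tokens
        if tok.type in (tokenize.NL, tokenize.NEWLINE)
    ]


division = "(\n\ta\n\t/\n\tb\n)\n"
tuple_with_comma = "(\n\ta\n\t,\n)\n"
```

For both snippets, only the line break after the closing parenthesis is a logical `NEWLINE`; the ones inside the parentheses are `NL` tokens.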

MichaReiser · 2023-07-14T06:51:49Z

parser/src/parser.rs

+# Indented magic
+for a in range(5):
+    %ls
+    pass


Can we add some cases that look like magics but aren't:

a % 5 ( a % 5 ) % a \ % 5

a % 5 ( a % 5 ) % # [*] a \ % 5

They should all work, assuming that on the line marked `[*]` the `%` sign stands alone (it is not part of another expression). That line isn't valid Python: Python raises a syntax error on it, and the lexer emits the `MagicCommand` token only for that line.

Ah, cool! Does this work because we only call into `lex_indentation` when the lexer is not in a parenthesized expression?

Yes that's correct.
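The idea confirmed here, that magics are only considered where a new logical line can start, can be sketched as follows. This is a simplified, hypothetical scanner rather than the lexer's actual code: `magic_line_numbers` is an illustrative name, only `%` is checked, and strings, comments, and multi-line magic commands are deliberately ignored.

```python
def magic_line_numbers(source):
    """Return 1-based numbers of the lines this sketch would lex as magics."""
    depth = 0          # current bracket nesting
    continued = False  # True if the previous line ended with a backslash
    result = []
    for number, line in enumerate(source.splitlines(), start=1):
        at_logical_start = depth == 0 and not continued
        if at_logical_start and line.lstrip().startswith("%"):
            result.append(number)
            continued = False
            continue
        # Not a magic: track brackets so later lines know the nesting depth.
        for ch in line:
            if ch in "([{":
                depth += 1
            elif ch in ")]}":
                depth = max(depth - 1, 0)
        continued = line.rstrip().endswith("\\")
    return result


source = "a % 5\n(\n    a\n    % 5\n)\n%ls\na \\\n% 5\n"
```

In `source`, the `%` inside the parentheses and the `%` after the backslash continuation are not at the start of a logical line, so only the `%ls` line is reported.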

dhruvmanila · 2023-07-15T15:02:15Z

parser/src/lexer.rs

+                #[cfg(feature = "full-lexer")]
+                Tok::NonLogicalNewline,


I'm not sure if this is the correct way to do this. It seems the tests in CI are not run with the `--all-features` flag.

I think the easiest is to remove the feature flag. We always use full-lexer.

Yes, but the CI doesn't use it:

RustPython-Parser/.github/workflows/ci.yaml

Line 18 in 2d1f69c

CARGO_ARGS: --no-default-features --features stdlib,zlib,importlib,encodings,ssl,jit

Or, it uses it only for certain steps:

RustPython-Parser/.github/workflows/ci.yaml

Line 58 in 2d1f69c

run: cargo clippy --all --features malachite-bigint,full-lexer,serde -- -Dwarnings

MichaReiser · 2023-07-17T08:45:47Z

core/src/mode.rs

+    /// [line magics]: https://ipython.readthedocs.io/en/stable/interactive/magics.html#line-magics
+    /// [Dynamic object information]: https://ipython.readthedocs.io/en/stable/interactive/reference.html#dynamic-object-information
+    /// [System shell access]: https://ipython.readthedocs.io/en/stable/interactive/reference.html#system-shell-access
+    /// [Automatic parentheses and quotes]: https://ipython.readthedocs.io/en/stable/interactive/reference.html#automatic-parentheses-and-quotes


Thanks. This comment is excellent!

parser/src/lexer.rs

MichaReiser · 2023-07-17T08:55:10Z

parser/src/lexer.rs

+                #[cfg(feature = "full-lexer")]
+                Tok::NonLogicalNewline,


I think the easiest is to remove the feature flag. We always use full-lexer.

MichaReiser · 2023-07-17T08:56:35Z

parser/src/lexer.rs

+                    kind: MagicKind::Magic,
+                },
+                #[cfg(feature = "full-lexer")]
+                Tok::NonLogicalNewline,


What's the reasoning that this is a non-logical newline? Isn't it a logical newline, because it terminates a statement?

Can we add a test where a magic command uses an invalid indent?

> What's the reasoning that this is a non-logical newline? Isn't it a logical newline, because it terminates a statement?

My main reason is that, since these tokens are filtered out before parsing, they should end with a `NonLogicalNewline`. Otherwise there would be multiple newline tokens, i.e., multiple blank lines. This is similar to how the `Comment` token is handled.

Although, now that I think of it, without the full-lexer feature (but with Mode::Jupyter), the NonLogicalNewline won't be filtered out while the MagicCommand token will be.
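The newline reasoning can be illustrated with plain token names. The sketch below is hypothetical: tokens are modeled as strings, `filter_trivia` stands in for the filtering step, and the feature-flag caveat is not modeled.

```python
# Token kinds the parser is assumed not to see in this sketch.
TRIVIA = {"MagicCommand", "Comment", "NonLogicalNewline"}


def filter_trivia(tokens):
    """Drop the tokens that are filtered out before parsing."""
    return [tok for tok in tokens if tok not in TRIVIA]


def has_consecutive_newlines(tokens):
    """Return True if two logical Newline tokens are adjacent."""
    return any(a == "Newline" and b == "Newline" for a, b in zip(tokens, tokens[1:]))


STATEMENT = ["Name", "Equal", "Int", "Newline"]

# `x = 1`, then a magic line, then `y = 2`, with the two candidate endings.
with_non_logical = STATEMENT + ["MagicCommand", "NonLogicalNewline"] + STATEMENT
with_logical = STATEMENT + ["MagicCommand", "Newline"] + STATEMENT
```

After filtering, `with_non_logical` reduces to the two statements, while `with_logical` leaves a stray `Newline` directly after the first statement's own `Newline`.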

> Can we add a test where a magic command uses an invalid indent?

Do you mean something like the following?

for i in range(10): print('hello') !pwd

> Do you mean something like the following?

Yes, exactly. Sorry, I should have provided an example.

We've decided to move ahead with the current implementation but will have to consider indentation once the parser is updated to account for any indentation error.

MichaReiser · 2023-07-17T08:57:50Z

parser/src/parser.rs

+/// When in [`Mode::Jupyter`], this will filter out all the Jupyter magic commands
+/// before parsing the tokens.
+///


Is this temporary or the final solution? I assumed that our plan is to introduce a new `MagicCommand` AST node.

Yes, we will be adding a new AST node for this (probably 2).

MichaReiser · 2023-07-17T08:58:51Z

Very well done!

dhruvmanila force-pushed the dhruv/jupyter-magic-syntax branch from 0977561 to 17bdc29 Compare July 6, 2023 04:08

dhruvmanila mentioned this pull request Jul 6, 2023

Use Jupyter mode while parsing Notebook files astral-sh/ruff#5552

Merged

dhruvmanila added 2 commits July 7, 2023 18:00

Lex Jupyter line magic with Mode::Jupyter

c64392c

Ignore MagicCommand token when parsing in Jupyter mode

94fafb5

dhruvmanila force-pushed the dhruv/jupyter-magic-syntax branch from 7509cd4 to 94fafb5 Compare July 10, 2023 04:58

dhruvmanila added 6 commits July 13, 2023 19:29

Consider line continuation character while lexing

73d3cda

Add tests, update docs

740df56

Update tests

1ce7f8c

More tests

0131560

Capture the magic kind in the token

85ce0dd

Move determining MagicKind to allow NonLogicalNewline

2279edc

dhruvmanila marked this pull request as ready for review July 14, 2023 05:56

dhruvmanila requested a review from MichaReiser July 14, 2023 05:56

MichaReiser reviewed Jul 14, 2023

dhruvmanila added 8 commits July 14, 2023 17:58

Add detailed docs for Jupyter magics with links

3a322cc

Handle line continuation + EOL, add test cases

81cb503

Update return type as function doesn't error

f8f8231

Revert back the line continuation + eol impl

bef6280

Fix infinite loop when escape character is last in file

cc05c04

Add lexer test cases for Jupyter magics

3cce7e4

Avoid indentation reset after lexing Jupyter magics

a2961a0

Use feature flag for NonLogicalNewline token

f113d7b

dhruvmanila commented Jul 15, 2023

dhruvmanila requested a review from MichaReiser July 15, 2023 15:07

MichaReiser approved these changes Jul 17, 2023

dhruvmanila added 2 commits July 17, 2023 20:40

Use functions instead of macro for readability

aa79d3d

Fix panic on unwrap when skipping the line continuation

a0812df

dhruvmanila merged commit 3b4c8ff into main Jul 18, 2023

dhruvmanila deleted the dhruv/jupyter-magic-syntax branch July 18, 2023 03:54

This was referenced Jul 26, 2023

Add ability to handle magic commands in Jupyter notebook astral-sh/ruff#5030

Closed

Complete Jupyter notebook integration astral-sh/ruff#5188

Closed
