Add fast-path for comment detection #9808

charliermarsh · 2024-02-03T16:22:26Z

Summary

When we fall through to parsing, the comment-detection rule accounts for a significant portion of lint time. This PR adds a fast heuristic: we bail out before parsing if a comment contains two consecutive name tokens (detected via the zero-allocation lexer). For `ctypeslib.py`, which has a few cases that are now caught by this check, it's a 2.5x speedup for the rule (and a 20% speedup for token-based rules).
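To make the heuristic concrete, here is a minimal, self-contained sketch. It does not use ruff's `SimpleTokenizer`; the `Lexer`, `Kind`, and `KEYWORDS` names are hypothetical stand-ins, and the keyword list is deliberately incomplete. The idea it illustrates is that valid Python never places two plain identifiers next to each other, so a comment like `# this is a comment` can be rejected without invoking the parser.

```rust
/// Hypothetical token kinds, far coarser than ruff's `SimpleTokenKind`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Name,
    Keyword,
    Other,
}

/// Illustrative subset of Python keywords (an assumption, not the full list).
const KEYWORDS: &[&str] = &[
    "and", "as", "def", "for", "if", "import", "in", "is", "not", "or", "return", "with", "yield",
];

/// A tiny lexer that borrows from the input and allocates nothing.
struct Lexer<'a> {
    rest: &'a str,
}

impl<'a> Iterator for Lexer<'a> {
    type Item = Kind;

    fn next(&mut self) -> Option<Kind> {
        // Skip whitespace between tokens.
        self.rest = self.rest.trim_start();
        let first = self.rest.chars().next()?;
        if first.is_alphabetic() || first == '_' {
            // Consume an identifier-like word.
            let end = self
                .rest
                .find(|c: char| !(c.is_alphanumeric() || c == '_'))
                .unwrap_or(self.rest.len());
            let (word, rest) = self.rest.split_at(end);
            self.rest = rest;
            if KEYWORDS.contains(&word) {
                Some(Kind::Keyword)
            } else {
                Some(Kind::Name)
            }
        } else {
            // Anything else (punctuation, digits, `#`) is a single `Other` token.
            self.rest = &self.rest[first.len_utf8()..];
            Some(Kind::Other)
        }
    }
}

/// Returns `true` if two `Name` tokens appear back to back, in which case
/// the text cannot be valid Python and parsing can be skipped.
fn has_consecutive_names(comment: &str) -> bool {
    let lexer = Lexer { rest: comment };
    let mut previous_was_name = false;
    for kind in lexer {
        let is_name = kind == Kind::Name;
        if is_name && previous_was_name {
            return true;
        }
        previous_was_name = is_name;
    }
    false
}
```

With this sketch, `has_consecutive_names("# this is a comment")` returns `true` (`a` followed by `comment`), while `has_consecutive_names("# x = foo(bar)")` returns `false` and would still fall through to the parser.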

codspeed-hq · 2024-02-03T16:45:12Z

CodSpeed Performance Report

Merging #9808 will improve performances by 4.84%

Comparing `charlie/eradicate` (c63bc5e) with `main` (b47f85e)

Summary

⚡ 2 improvements
✅ 28 untouched benchmarks

Benchmarks breakdown

| | Benchmark | `main` | `charlie/eradicate` | Change |
|---|---|---|---|---|
| ⚡ | `linter/all-with-preview-rules[numpy/ctypeslib.py]` | 24.4 ms | 23.2 ms | +4.84% |
| ⚡ | `linter/all-rules[numpy/ctypeslib.py]` | 21.7 ms | 20.7 ms | +4.44% |

crates/ruff_linter/src/rules/eradicate/detection.rs

MichaReiser · 2024-02-04T11:14:28Z

crates/ruff_python_trivia/src/tokenizer.rs

@@ -182,7 +182,7 @@ fn to_keyword_or_other(source: &str) -> SimpleTokenKind {
        "case" => SimpleTokenKind::Case,
        "with" => SimpleTokenKind::With,
        "yield" => SimpleTokenKind::Yield,
-        _ => SimpleTokenKind::Other, // Potentially an identifier, but only if it isn't a string prefix. We can ignore this for now https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals
+        _ => SimpleTokenKind::Name, // Potentially an identifier, but only if it isn't a string prefix. We can ignore this for now https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals


How can we avoid returning a Name for a string prefix?
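One possible direction, sketched here as an assumption and not as what the PR ended up doing: treat a word as a string prefix only when it is one of the prefixes listed in the linked lexical-analysis reference (`r`, `u`, `b`, `f`, `br`, `rb`, `fr`, `rf`, in any letter case) and is immediately followed by a quote character. The helper names below are hypothetical and are not part of `SimpleTokenizer`.

```rust
/// String and bytes prefixes from the Python lexical-analysis reference,
/// lower-cased; matching is done case-insensitively.
const STRING_PREFIXES: &[&str] = &["r", "u", "b", "f", "br", "rb", "fr", "rf"];

/// Returns `true` if `word` is a valid string or bytes prefix in any letter case.
fn is_string_prefix(word: &str) -> bool {
    STRING_PREFIXES
        .iter()
        .any(|prefix| word.eq_ignore_ascii_case(prefix))
}

/// Returns `true` if `source` begins with a prefixed string literal, that is,
/// an identifier-like word that is a string prefix and is directly followed
/// by a single or double quote.
fn starts_string_literal(source: &str) -> bool {
    let end = source
        .find(|c: char| !(c.is_alphanumeric() || c == '_'))
        .unwrap_or(source.len());
    let (word, rest) = source.split_at(end);
    is_string_prefix(word) && matches!(rest.chars().next(), Some('"' | '\''))
}
```

A lexer could call a check like `starts_string_literal` before emitting `Name`, and emit `Other` when it returns `true`. The cost is one extra character of lookahead for short words, which would need benchmarking before drawing conclusions about performance.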

github-actions · 2024-02-04T21:51:03Z

`ruff-ecosystem` results

Linter (stable)

ℹ️ ecosystem check encountered linter errors. (no lint changes; 1 project error)

sphinx-doc/sphinx (error)

ruff failed
  Cause: Selection of unstable rules without the `--preview` flag is not allowed. Enable preview or remove selection of:
	- FURB113
	- FURB131
	- FURB132

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

ℹ️ ecosystem check encountered format errors. (no format changes; 2 project errors)

sphinx-doc/sphinx (error)

ruff format --no-preview --exclude tests/roots/test-pycode/cp_1251_coded.py

ruff failed
  Cause: Selection of unstable rules without the `--preview` flag is not allowed. Enable preview or remove selection of:
	- FURB113
	- FURB131
	- FURB132

openai/openai-cookbook (error)

warning: Detected debug build without --no-cache.
error: Failed to parse examples/dalle/Image_generations_edits_and_variations_with_DALL-E.ipynb:3:7:8: Unexpected token 'prompt'

Formatter (preview)

ℹ️ ecosystem check encountered format errors. (no format changes; 1 project error)

openai/openai-cookbook (error)

ruff format --preview

warning: Detected debug build without --no-cache.
error: Failed to parse examples/dalle/Image_generations_edits_and_variations_with_DALL-E.ipynb:3:7:8: Unexpected token 'prompt'

MichaReiser

I don't feel comfortable having such an important constraint in an inline comment that violates the basic properties of SimpleTokenizer. We should explore if we can support proper name lexing in SimpleTokenizer without degrading performance.

charliermarsh force-pushed the charlie/eradicate branch from 905202d to dd1fe54 Compare February 3, 2024 16:38

MichaReiser reviewed Feb 4, 2024

charliermarsh force-pushed the charlie/eradicate branch 6 times, most recently from 03a1844 to a78aec3 Compare February 4, 2024 21:32

charliermarsh marked this pull request as ready for review February 4, 2024 21:37

charliermarsh requested a review from MichaReiser February 4, 2024 21:37

charliermarsh added the performance Potential performance improvement label Feb 4, 2024

MichaReiser requested changes Feb 5, 2024

charliermarsh requested a review from MichaReiser February 5, 2024 13:44

MichaReiser approved these changes Feb 5, 2024

charliermarsh force-pushed the charlie/eradicate branch from a78aec3 to 03dcbf5 Compare February 5, 2024 15:49

Add fast-path for comment detection

c63bc5e

charliermarsh force-pushed the charlie/eradicate branch from 03dcbf5 to c63bc5e Compare February 5, 2024 15:50

charliermarsh merged commit 9781563 into main Feb 5, 2024
17 checks passed

charliermarsh deleted the charlie/eradicate branch February 5, 2024 16:00

miccal mentioned this pull request Feb 6, 2024

ruff 0.2.1 Homebrew/homebrew-core#161912

Merged

bswck mentioned this pull request Mar 15, 2024

Bump ruff-pre-commit to v0.3.2 python-poetry/cleo#412

Merged
