Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PLR0401 Cyclic import implementation #2914

Open
chanman3388 opened this issue Feb 15, 2023 · 12 comments
Open

PLR0401 Cyclic import implementation #2914

chanman3388 opened this issue Feb 15, 2023 · 12 comments
Labels
core Related to core functionality multifile-analysis Requires analysis across multiple files rule Implementing or modifying a lint rule

Comments

@chanman3388
Copy link
Contributor

Hello fellow ruffians(?)

I was thinking of having a stab at implement pylint's R0401 cyclic import rule, but before I set off I thought it might be wise to ask your opinions and to clarify my current understanding of how ruff works.

Would I be correct to say that, all the python files to be linted are collected, then each one is linted individually in parallel?
It seems the expected output from lint_path is a Result<Diagnostics>. If I wanted to track imports, I noticed there was an imports.rs module, but to the best of my knowledge this is called via linter::check_path which is also only run per file?

My initial thoughts were perhaps to change the APIs somewhat to return a Vec of the imports, create a graph and then detect cycles.

@charliermarsh
Copy link
Member

👍 Related to the question that was just asked here: #970 (comment).

Everything you've said above is correct. Right now, we use a single-file model, so every file is analyzed independently. We do some work upfront to determine the package hierarchy (e.g., to determine the package root for every file), which is the only part of the analysis pipeline that depends on "multiple files", so-to-speak.

(And yeah -- imports.rs is for enforcing rules like "Are the imports in this file sorted?". That logic does depend on being able to determine which imports are first-party, but that is done by looking at the contents of the package root as described above, so it doesn't depend on any deeper analysis than "If we import foo, does a foo directory or foo.py file exist in src?)

I think you're right that we want the linter pass to return some kind of "dependency graph" for each module, which should include the members that the module "exports" (implicitly, or explicitly via __all__), and the members that the module depends on. If we had that information, it would unlock tons of rules -- cyclic imports (although this relies on detecting whether the import is at the top-level, I assume, which requires tracking that metadata too), but also, "member not defined in module", etc. (although for third-party modules, that would get trickier).

We could start with a subset of that behavior, though, e.g., just the modules that a given module depends on, as you've described above. I'd be open to reviewing a PR or proposal to that effect!

@charliermarsh charliermarsh added rule Implementing or modifying a lint rule core Related to core functionality labels Feb 16, 2023
@chanman3388
Copy link
Contributor Author

Thanks for the feedback, sounds good! I've not been as free as I'd like but I'll come up with something on the weekend.

@chanman3388
Copy link
Contributor Author

chanman3388 commented Feb 26, 2023

Sorry I took so long to getting round to having a stab at this. Do you think it would make sense for me just to return the Located<U, T> Import and ImportFrom objects or should I be making some kind of custom type to house the full node path and its location?

@charliermarsh
Copy link
Member

How does the cyclic import rule work? Does it just detect cyclic imports at the top-level? I.e., are delayed or deferred imports (in function bodies) ignored?

I think either approach (returning AST nodes, or making a custom type) could work. We'll probably convert to some kind of internal representation when we do the cycle detection either way.

@chanman3388
Copy link
Contributor Author

I can confirm that pylint doesn't detect cyclic imports in delayed/deferred imports and only in top-level imports.

It also doesn't attribute the cyclic import to the correct module neither.
I made something which looked like this:

src
- __init__.py
- e
  - __init__.py
  - a.py (import src.f.a)
 - f
  - __init__.py
  - a.py (from src.g import a as b)
 - g
  - __init__.py
  - a.py (import src.e.a)

There are also the modules a-d too that I made with cyclic imports but what pylint throws out is:

************* Module src.c.__init__
src/c/__init__.py:1:0: R0401: Cyclic import (src.a -> src.b) (cyclic-import)
src/c/__init__.py:1:0: R0401: Cyclic import (src.e.a -> src.f.a -> src.g.a) (cyclic-import)
src/c/__init__.py:1:0: R0401: Cyclic import (src.c.a -> src.d.a) (cyclic-import)

Which I feel like is something we'd be able to avoid.

@chanman3388
Copy link
Contributor Author

Given a go at at least propagating the information on all the imports every module has in #3243

@Avasam
Copy link

Avasam commented Mar 25, 2023

Given Ruff's sheer speed, it has a good shot at implementing cyclic import detection even better than pylint and pyright !
pylint's cyclic-import and duplicated-code reports are inconsistant: pylint-dev/pylint#374
pyright still reports cyclic imports when said import is behind a TYPE_CHECKING check (aka a type-only import). microsoft/pyright#3489 (comment)

@charliermarsh
Copy link
Member

Oh interesting! Yeah we do track that info, we should be able to differentiate between typing-only imports when detecting cycles :)

@mat100payette
Copy link

Any update on this? It's one of the main things I'd love to have given that my tests sometimes fail on circular imports and having the fast linter spot these before running the tests would be optimal. Thanks!

@chanman3388
Copy link
Contributor Author

chanman3388 commented Jun 16, 2023

There's an attempt here #3880
Main issue I think is that it's quite different to the other rules as it needs information from other modules that are linted, which doesn't really vibe with how Ruff generally works.

@mat100payette
Copy link

Lovely! Good luck with that then. I wish I could help but unfortunately my Rust knowledge is non-existent :c

@sanmai-NL
Copy link

@charliermarsh CPython 3.12 now has low-overhead dynamic tracing, which could be used to trace imports and e.g., find cycles. https://docs.python.org/3.12/whatsnew/3.12.html#pep-669-low-impact-monitoring-for-cpython

@MichaReiser MichaReiser added the multifile-analysis Requires analysis across multiple files label Mar 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Related to core functionality multifile-analysis Requires analysis across multiple files rule Implementing or modifying a lint rule
Projects
None yet
Development

No branches or pull requests

6 participants