Finding references analyzes excluded files #123

Closed
jahan01 opened this issue Jul 16, 2020 · 22 comments
Labels
needs investigation Could be an issue - needs investigation

Comments

@jahan01

jahan01 commented Jul 16, 2020

Environment data

  • Language Server version: 2020.7.2
  • OS and version: Mac 10.15.5
  • Python version (& distribution if applicable, e.g. Anaconda): Anaconda 3.6

Expected behaviour

When Find References is triggered, the search is done only on files that are not excluded in the workspace.

Actual behaviour

When Find References is triggered, scanning is also done on files that are excluded by files.exclude.

@erictraut
Contributor

Thanks for the suggestion.

The files.exclude setting is designed to hide files in the file browser within VS Code. I don't think it's intended to exclude files from language services. Other language service providers don't appear to honor the files.exclude setting. For example, the TypeScript LS returns references in files that are hidden by files.exclude. When you click on such a reference, VS Code temporarily "unhides" the file in the left panel and opens the file in a tab.

Could you explain your use case a bit more? Perhaps we can offer another way to accomplish what you're trying to do.

@erictraut erictraut added the waiting for user response Requires more information from user label Jul 16, 2020
@jahan01
Author

jahan01 commented Jul 16, 2020

@erictraut Actually, in my use case there are no references in the excluded folders. There are a couple of thousand Python files in those excluded folders, and find references just takes forever to scan them. In fact, I was not able to see any references in this particular project even after waiting a good 15 minutes.

Jedi and the old MS Python language server don't have this issue.

It could be a case of slow scanning, or of having to scan the entire workspace every time find references is triggered.

Anyways feel free to close this if you are not intending to support it.

@erictraut
Contributor

erictraut commented Jul 16, 2020

OK, thanks for the additional details. Let's see if we can figure out what's causing the performance issue. Even with thousands of python source files, I wouldn't expect "find references" to take more than a few seconds. So something is going on here that we should understand better.

Could you look in the Output window and paste logs that might be of relevance? Here are the instructions from the bug template:

  • Select "View: Toggle Output" from the command palette (Ctrl+Shift+P on Windows/Linux, Command+Shift+P on macOS), then select "Python Language Server" in the dropdown on the right. Look for the line Pylance Language Server version X in the console.
  • Enable trace logging by adding "python.analysis.logLevel": "Trace" to your settings.json configuration file.
  • Adding this will cause a large amount of info to be printed to the Python output panel. This should not be left long term, as the performance impact of the logging is significant.

@jakebailey
Member

FWIW, MPLS does not respect this value for references either, only for workspace symbol searches. So I'd also be interested in finding the bug that's making it scan too heavily.

@jahan01
Author

jahan01 commented Jul 16, 2020

@erictraut: Actually, it turns out I have way more than a couple of thousand. It's around 4-6K Python source files, and analysis gets stuck around the 10575th line in the log (there are approximately 3 log lines per file: parsing, binding, and in some cases checking and analyzing).

Most files get analysed in under 10 ms, with a few files taking more than 2000 ms. I can pull up the histogram if you are really interested.

We are sort of working in a "mono" repo, where the excluded folder contains legacy code and site packages, which are never imported by the current workspace code.
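
To get a rough picture of which top-level folders hold most of the Python files in a workspace like the one described above, something along these lines could help. It is only a sketch: `count_python_files` is a hypothetical helper, not part of Pylance or Pyright, and it counts files on disk without knowing what the language server actually analyzes.

```python
from collections import Counter
from pathlib import Path


def count_python_files(root):
    """Return a Counter mapping each top-level entry under root to its .py file count."""
    root = Path(root)
    counts = Counter()
    for path in root.rglob("*.py"):
        if not path.is_file():
            continue
        relative = path.relative_to(root)
        # Files that sit directly in root are grouped under "."
        top = relative.parts[0] if len(relative.parts) > 1 else "."
        counts[top] += 1
    return counts


if __name__ == "__main__":
    # Print the ten top-level folders with the most Python files.
    for folder, n in count_python_files(".").most_common(10):
        print(f"{n:6d}  {folder}")
```

Run from the workspace root, this lists the largest folders first, which may point at candidates for exclusion.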

FWIW MPLS does not respect this value for references either, only for workspace symbol searches. So I'd also be interested in finding the bug that's making it scan too heavily as well.

If I understand correctly, the old LS builds indexes based on the open Python file and recursively navigates to the sources that are reachable from it. If I open a file that was not reachable from the first file, I can see the indexes building again. So the code in the excluded folder is never indexed. And references are always pulled from this index, not scanned on every request. Anyway, you guys should know best.

@erictraut
Contributor

My team also has a mono repo with somewhere between 3-4K python files, and "find references" completes after several seconds. It sounds like your source base is slightly bigger, but that shouldn't be a problem.

My hypothesis is that there's a particular file in your source base that is triggering a bug in the analyzer — something that breaks a performance or scalability assumption. If you look at your logs, does it appear to always stop after the same file? You mentioned that it stopped after the 10575th log line. Is that consistently the same file? If so, is there something unusual or unique about the contents of that file?

Thanks for your help with this!

@jahan01
Author

jahan01 commented Jul 16, 2020

Sorry, 10575 seems inconsequential. The line at which the logs get stuck is always the file from which the find-references request is triggered. The scan seems to be running chronologically, so files after this one aren't being analysed.

If I remove the excluded folder from disk, the scan completes quickly and results are displayed.
I tested by triggering references from multiple files, and the pattern repeats: it is always stuck at the file where the request is triggered, and it works when the "excluded" folder is removed from the workspace. So I don't think one weird file is causing this.

If you think the repo is too big, for starters, how about providing a setting to exclude folders/globs from scanning?

@erictraut
Contributor

There is a mechanism for excluding folders/globs, but it's not currently exposed as a VS Code setting. You would need to use the pyrightconfig.json config file, which is documented here. Go to the bottom of that page for an example that shows how to use the "exclude" config entry.
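
For anyone who wants to script this, here is a minimal sketch that creates or updates a pyrightconfig.json with an "exclude" entry. The helper name `write_pyright_config` and the folder names in the usage note are made up for illustration. It also assumes the existing file, if any, is plain JSON without comments.

```python
import json
from pathlib import Path


def write_pyright_config(workspace, exclude):
    """Create or update pyrightconfig.json in workspace with the given exclude globs."""
    config_path = Path(workspace) / "pyrightconfig.json"
    config = {}
    if config_path.exists():
        # Assumption: the existing file is plain JSON (no comments or trailing commas).
        config = json.loads(config_path.read_text(encoding="utf-8"))
    # Replace any previous "exclude" list but keep all other keys untouched.
    config["exclude"] = list(exclude)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config_path
```

A call such as `write_pyright_config(".", ["legacy", "**/site-packages"])` would write those two entries under "exclude". The paths are placeholders; use the folders that apply to your repo.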

@jahan01
Author

jahan01 commented Jul 16, 2020

Thanks @erictraut, after adding the exclusion, finding references is working perfectly fine.

@ThiefMaster

Having the build dir included is very annoying. It's basically a duplicate of my code and a place I NEVER want to look at. Sure, I could delete it after building a wheel, but that's not the point...

image

@erictraut
Contributor

@ThiefMaster, have you added a "pyrightconfig.json" config file and added your build directory to the "exclude" section as suggested above? That should properly exclude it from symbol lookups. Let us know if it doesn't.

@ThiefMaster

didn't notice that comment, will give it a try!

@alx00x

alx00x commented Dec 18, 2020

Just to confirm, adding excludes to pyrightconfig.json actually works.

This has been a major pain for me for quite some time and I've been waiting for vscode to resolve the issue here: microsoft/vscode#46718

Thank you @erictraut for pointing this out! 👍 This makes a huge difference for large codebases.

@SowingSadness

SowingSadness commented Jan 24, 2021

@erictraut Find References in all other languages works in accordance with search.exclude.
The LocalHistory plugin saves all versions of the same file in a .history folder in the root of the project.

TypeScript example:
image

Pylance example:
image

@untrix

untrix commented Feb 9, 2021

pyrightconfig.json works, thanks.

However, this feature is certainly needed in the VS Code settings, IMO. In my use case there is a run-logs folder to which wandb (Weights & Biases) copies a few Python files whenever I execute an ML training run, as a snapshot of the code that was run (it also saves diffs of files not committed to git). Over time, the folder ends up with tens or hundreds of run directories. It is in my .gitignore file, so these do not get checked into git. But when I execute find-references, references from all these run folders show up as well, and many times I inadvertently end up editing these files. I modified all possible 'exclude' settings in both User and Workspace, but those references kept showing up. Finally, setting up a pyrightconfig.json in the workspace folder worked, but it would be much better if the setting were exposed in the main IDE or within Pylance.

@ehendrix23

Just want to add that I ran into the same issue. It would be good to have this within the IDE settings.

@acnebs

acnebs commented Mar 9, 2021

+1 to exposing the pyright config in VSCode settings, maybe like this:

{
  "python.analysis.pyrightConfig": {
    "exclude": [
      "**/dir"
    ]
  }
}

Would like to be able to exclude different directories in different VSCode workspaces.

@casassg

casassg commented Mar 12, 2021

Same here, it would be nice to have a way to surface this setting. Respecting search.exclude would also work! Because we have a monorepo, it's currently impossible to use Pylance (I'm trying pyrightconfig.json to see if I can make it work).

@nhjk

nhjk commented Mar 31, 2021

Adding excludes to pyrightconfig.json worked for me. For those who have venvs in their project directory, make sure to add those as well, otherwise find references will take forever. Pylance excludes venv directories by default, but when you add your own directories it doesn't merge them with Pylance's defaults.

Edit: I agree with casassg that respecting search.exclude (and possibly merging with Pylance's default exclude directories) would be nice. When I first stumbled across this issue, adding the directory to search.exclude was the first thing I did. If that had worked, I would have resolved this issue in 5 mins instead of...
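
Since the venv directories have to be listed by hand next to your own excludes, a small script can find them for you. This is a rough sketch. It assumes virtual environments can be recognised by a pyvenv.cfg file at their top level, which is what `python -m venv` writes; conda environments and other layouts will not be detected. Both function names are hypothetical.

```python
from pathlib import Path


def find_venv_dirs(root):
    """Return workspace-relative paths of directories that contain a pyvenv.cfg file."""
    root = Path(root)
    found = []
    for cfg in root.rglob("pyvenv.cfg"):
        # The directory holding pyvenv.cfg is treated as the venv root.
        found.append(cfg.parent.relative_to(root).as_posix())
    return sorted(found)


def build_exclude_list(root, extra):
    """Combine caller-supplied globs with detected venv directories, without duplicates."""
    combined = []
    for entry in list(extra) + find_venv_dirs(root):
        if entry not in combined:
            combined.append(entry)
    return combined
```

The returned list can then be pasted into the "exclude" section of pyrightconfig.json.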

@picrots

picrots commented Jul 7, 2021

Fantastic!
Using a pyrightconfig.json file and monitoring the trace logs for the Python Language Server helped me find the directory that causes the issue.
When I excluded that folder, it became fast again.
The weird part is that trying to narrow down the search to find the exact file that causes the issue was unsuccessful. Maybe the issue is not caused by a single file but by a combination of conditions that certain files create.

@jakebailey
Member

It seems as though everything has converged here on exclude being the option that fixes things, and that we should get that into a VS Code setting. That's #1150, so I'm going to close this issue in favor of it.

@sv158

sv158 commented Nov 5, 2021

My problem is a bit different. When I jump to a declaration in a module of the Python standard library, it still type-checks that module. I tried excluding both the relative path and the absolute path in search.exclude, pyright's exclude and files.watcherExclude, and none of them worked.
