-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
poetry install hugely slowed down by deepcopy operations #7459
Comments
The deepcopy operations are part of the solver and not an extraneous/non-core bit of Poetry. Work to improve performance is of course welcome (and you might have noticed from release notes that we have made significant strides there recently), but please note that "remove these expensive deepcopies" is probably not something that is possible with my understanding of the current architecture. Something like "evaluate how many copies are truly necessary vs. defensive in the solver" is much more tenable, but is also much more involved. |
I managed to create a reproducible project made of public dependencies, which you can find at this gist. Steps to reproduce:
The command is still |
Poetry is running the solver when it loads the LockfileRepository to verify the lock file is valid (e.g. doesn't have conflicting packages/will not result in an obvious nonsense solution). Invalid lock files are something that can happen e.g. from those who remove the hash and See #496 for more context on this. |
@neersighted I thank you for your incredible contribution to the project and recent improvements (really looking forward 1.4 btw!). As I understand that touching those deepcopies mustn't be the easiest thing ever, I was still wondering if there was some low hanging fruit since the slowdown happens when everything has already been solved. As I was writing this comment, I received your reply above :) I'm not really a fan of supporting incorrect usages of the |
I'll leave that to @radoering and @dimbleby; however, I don't think that's something we will accept upstream. The support burden of adding a knob that says "be less correct" when many users already subvert the protections we have in place (like the hash) would be massive, and it's not something any upstream developer would use or reason in terms of. |
re-solving is baked in to how In some sense it's best not to think of
Normally speaking this is fast because there are so few choices for each package in Of course merge requests bringing performance improvements are welcome... While I don't think there's enormous amounts of low-hanging fruit, I would bet that a sufficiently motivated person could make meaningful gains - and I could absolutely believe that all the |
This might be relevant: I'm writing If poetry workflow involves a scenario when the same object is copied multiple times, like so: for i in range(100):
mutate(deepcopy(data)) Replacing it with this would give you instant performance improvement: data_copier = duper.deepdups(data)
for i in range(100):
mutate(data_copier()) |
-vvv
option) and have included the output below.Issue
poetry install
takes considerable time (70 seconds) to run what should be a no-op. In fact:No dependencies to install or update
Although I cannot share the
pyproject.toml
andpoetry.lock
files as they contain private dependencies (so you wouldn't be able to reproduce it), I can share a profiling ofpoetry
that obtained running cProfile (link here). I'll share a few links that obtained plotting withsnakeviz
the result ofAs you can see lots of time seems to be spent in
deepcopy
operations inpoetry-core
.I'll try to create a fully reproducible use case. In the meantime I can also share that:
Thanks a lot for your support and for maintaining this amazing tool ❤️
The text was updated successfully, but these errors were encountered: