Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should a local file always be re-installed on 'pip install' (with or without '--upgrade')? #8711

Closed
gaborbernat opened this issue Aug 5, 2020 · 69 comments
Labels
C: direct url Direct URL references (PEP 440, PEP 508, PEP 610) state: needs discussion This needs some more discussion type: deprecation Related to deprecation / removal.
Milestone

Comments

@gaborbernat
Copy link

gaborbernat commented Aug 5, 2020

Without the resolver pip always reinstalled source distributions that it got as a direct command-line argument. Such is the case of tox, that builds the sdist, and then invokes pip to install it. In a pip with resolver world now, pip seems to no longer do this but rather first check the version number to see if it's not already installed:

  Getting requirements to build wheel ... done
    Created temporary directory: /private/var/folders/kt/btg285ds5kx4l1k398lb7df80000gr/T/pip-modern-metadata-v95a7j6j
    Running command /tmp/magic/.tox/test/bin/python /tmp/magic/.tox/test/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py prepare_metadata_for_build_wheel /var/folders/kt/btg285ds5kx4l1k398lb7df80000gr/T/tmpqp55wn5c
    running dist_info
    creating /private/var/folders/kt/btg285ds5kx4l1k398lb7df80000gr/T/pip-modern-metadata-v95a7j6j/magic.egg-info
    writing /private/var/folders/kt/btg285ds5kx4l1k398lb7df80000gr/T/pip-modern-metadata-v95a7j6j/magic.egg-info/PKG-INFO
    writing dependency_links to /private/var/folders/kt/btg285ds5kx4l1k398lb7df80000gr/T/pip-modern-metadata-v95a7j6j/magic.egg-info/dependency_links.txt
    writing entry points to /private/var/folders/kt/btg285ds5kx4l1k398lb7df80000gr/T/pip-modern-metadata-v95a7j6j/magic.egg-info/entry_points.txt
    writing requirements to /private/var/folders/kt/btg285ds5kx4l1k398lb7df80000gr/T/pip-modern-metadata-v95a7j6j/magic.egg-info/requires.txt
    writing top-level names to /private/var/folders/kt/btg285ds5kx4l1k398lb7df80000gr/T/pip-modern-metadata-v95a7j6j/magic.egg-info/top_level.txt
    writing manifest file '/private/var/folders/kt/btg285ds5kx4l1k398lb7df80000gr/T/pip-modern-metadata-v95a7j6j/magic.egg-info/SOURCES.txt'
    reading manifest file '/private/var/folders/kt/btg285ds5kx4l1k398lb7df80000gr/T/pip-modern-metadata-v95a7j6j/magic.egg-info/SOURCES.txt'
    writing manifest file '/private/var/folders/kt/btg285ds5kx4l1k398lb7df80000gr/T/pip-modern-metadata-v95a7j6j/magic.egg-info/SOURCES.txt'
    creating '/private/var/folders/kt/btg285ds5kx4l1k398lb7df80000gr/T/pip-modern-metadata-v95a7j6j/magic.dist-info'
    Preparing wheel metadata ... done
  Source in /private/var/folders/kt/btg285ds5kx4l1k398lb7df80000gr/T/pip-req-build-h298ahlt has version 1.4.1, which satisfies requirement magic==1.4.1 from file:///tmp/magic/.tox/.tmp/package/1/magic-1.4.1.tar.gz
  Removed magic==1.4.1 from file:///tmp/magic/.tox/.tmp/package/1/magic-1.4.1.tar.gz from build tracker '/private/var/folders/kt/btg285ds5kx4l1k398lb7df80000gr/T/pip-req-tracker-5awzmelt'
Removed build tracker: '/private/var/folders/kt/btg285ds5kx4l1k398lb7df80000gr/T/pip-req-tracker-5awzmelt'

For local development projects, the version number though is highly untrustworthy, and the records/their contents are different even if the version number is the same. You generally only change versions at releases. We should either always re-install sdists/wheels passed in as paths or ensure that the content of the installed and passed are byte identical...

PS. tox/users could pass --force-reinstall to force the operation I guess, but this would be a breaking feature from POV of pip users.

@triage-new-issues triage-new-issues bot added the S: needs triage Issues/PRs that need to be triaged label Aug 5, 2020
@pradyunsg pradyunsg changed the title local development and pip resolver local development and pip's 2020 resolver Aug 5, 2020
@sbidoul
Copy link
Member

sbidoul commented Aug 5, 2020

This relates to #5780 and the upgrade strategies for direct URLs.

@sbidoul sbidoul added C: direct url Direct URL references (PEP 440, PEP 508, PEP 610) C: new resolver labels Aug 5, 2020
@triage-new-issues triage-new-issues bot removed the S: needs triage Issues/PRs that need to be triaged label Aug 5, 2020
@sbidoul sbidoul added the state: needs discussion This needs some more discussion label Aug 5, 2020
@uranusjr
Copy link
Member

uranusjr commented Aug 5, 2020

Agreed. More specifically, the old resolver uses different logic for PEP 508 and legacy requirements on whether it should re-install a URL requirement. The new one tries to be consistent, but that consistent logic needs refining. I think the conclusion in #5780 is to

  • Always re-install if the URL changes (possible with PEP 610)
  • Use the existing installation if the URL does not change, and the package version matches
  • Re-install if the package version changes, even if the URL is the same

I’m not sure what parts of the URL PEP 610 records; if it contains the query string part, tools like tox should be able to force package re-installation with general cache-busting techniques.

@sbidoul
Copy link
Member

sbidoul commented Aug 5, 2020

And let's not forget editables, which are always re-installed :)

@gaborbernat
Copy link
Author

I’m not sure what parts of the URL PEP 610 records; if it contains the query string part, tools like tox should be able to force package re-installation with general cache-busting techniques.

Can you detail how this would look?

Use the existing installation if the URL does not change, and the package version matches

Note this use falls under this section which is not a solution for this issue and users will be surprised by this change in behaviour.

@uranusjr
Copy link
Member

uranusjr commented Aug 6, 2020

The most widely-used generic cache busting approach simply appends a hash to the query string, so you’d do something like

pip install "package_name @ file://path/to/file.whl?tox=1234abcd"

where the tox= value is machine generated and changes every time (e.g. build timestamp, hash, or UUID). This would make the URL change, without actually changing the thing it points to.

@gaborbernat
Copy link
Author

That sounds something a tool could do (though all tooling would need to change how to install packages), but not that useful for humans 🤔

@uranusjr
Copy link
Member

uranusjr commented Aug 6, 2020

My understanding is that pip doesn’t (never did?) position itself as a development tool, and is expected to be used in conjuction of other tooling to work well in such settings.


pip’s current behaviour (always re-installs bare URLs, but never re-installs PEP 508 URL specs) is entirely arbitrary without justification. My understanding is that the situation is caused by historic oversight, and should be classified as a bug. But it is also not obvious which behaviour is “corerct”, and there are two camps of people wanting the opposite approach. The resolution in #5780 is most reasonable middle ground (which actually makes sense!) that I know. I am fully aware that it would break someone’s workflow no matter what we do, but I still believe it’s something that should be done.

@gaborbernat
Copy link
Author

How would maintaining status quo break peoples workflow? 😊

@pradyunsg
Copy link
Member

pradyunsg commented Aug 6, 2020

always re-installs bare URLs

I've not read this issue yet, but this behaviour is definitely intentional: #4764 + #536

never re-installs PEP 508 URL specs

🤷

@gregcouch
Copy link

This a big change for local development. During local development, we build the wheel, then pip install --upgrade it. The version number does not change during the local development cycle. Adding --use-feature=2020-resolver prevents the "upgrade" from happening. --force-reinstall is overkill because it reinstalls all of the dependencies (and actually doesn't work when some of the dependencies are not (yet) on pypi). --no-deps only works if all of the dependencies have already been installed, ie., don't change during development.

If --force-reinstall could be limited to the packages on the command line, which is what I would expect, then that would be a reasonable solution.

@uranusjr
Copy link
Member

uranusjr commented Aug 6, 2020

If --force-reinstall could be limited to the packages on the command line, which is what I would expect, then that would be a reasonable solution.

This is currently possible with --force-reinstall --no-deps (with the caveat that it also won’t fix your dependencies if they need upgrading). Maybe we should have something like --upgrade-strategy for this.

@pradyunsg
Copy link
Member

pradyunsg commented Aug 6, 2020

I think local directories/files passed directly via the CLI (like pip install . or "pip install file) are a special case where we should unconditionally reinstall those.

I do agree that all other URL and package @ URL should be treated consistently - the fact that they differ was a quality of implementation issue.

@uranusjr
Copy link
Member

uranusjr commented Aug 6, 2020

I believe both the old and new resolvers do directory and VCS installations consistently. The inconsistent part is when they’re passed a URL to sdist or wheel, so #4764 and #536 don’t apply.

@uranusjr
Copy link
Member

uranusjr commented Aug 6, 2020

I thought about this a bit more; maybe we should just always reinstall a URL, no matter if it’s PEP 508, no matter whether it’s editable, and no matter whether it points to a directory, sdist, wheel, or VCS. The not-reinstall-URL thing is an optimisation anyway, and by always reinstalling a URL-specified requirements, we also give the user looking for optimisations use indexes instead, with which pip can offer even more optimisations than reinstallation.

@sbidoul
Copy link
Member

sbidoul commented Aug 6, 2020

always reinstall a URL

That can't work, it will raise a load of (performance) issues for people who actually need them.
I for one spent time, e.g. implementing caching of immutable VCS URLs for a reason.

Asking people to use to indexes won't make it either. One reason is that using indexes requires generating unique version numbers which is a burden when, e.g. you want to install an unmerged upstream PR for some dependency.

Always reinstalling local directories sounds better to me.

Perhaps always reinstalling local files too, or not.

A more targeted --force-reinstall variant would be interesting to explore too.

What is actually the use case for installing from a wheel or sdist in a development workflow?
Would there be an issue with "pip install localdir" (which builds a wheel anyway before installing)?

@gregcouch
Copy link

An issue with "pip install localdir" is how to pass arguments to setup.py (e.g., --no-user-cfg, --debug). It is much simpler to keep the building and the installing of the wheels separate.

@brainwane
Copy link
Contributor

@uranusjr said:

I am torn on 8711; I want to do something about it before the resolver becomes the default, but that is a very involved issue that may delay the schedule a lot.

But while speaking with @pradyunsg yesterday I learned that, according to Pradyun, this issue is resolved. Pradyun, could you clarify?

@uranusjr
Copy link
Member

Oops, I messed up the issue number, sorry. The complicated issue that may delay the rollout I had in mind is #8785, not this one.

@uranusjr
Copy link
Member

I don’t think this one is resolved yet though? As mentioned above, this is related to #5780, and we need to make a decision on that one. Although we probably can close this one and only track #5780 (which contains a more concrete plan).

@pfmoore
Copy link
Member

pfmoore commented Nov 20, 2020

I'm getting confused over the different suggestions, so apologies, but are you saying that we shouldn't reinstall on any sort of archive file, and only reinstall unpacked source trees (local directories)? That's not going to address the tox case that was the original subject of this issue, but I'm not against it...

@sbidoul
Copy link
Member

sbidoul commented Nov 20, 2020

@pfmoore ...or reinstall all sorts, I honestly don't know. All I'm saying so far is that I think the behavior should be same for file:// and https://, probably. To say anything more conclusive I'm afraid I'd need to do the full research on the table I posted... and just like you I don't have the bandwidth to do it correctly these days. Maybe someone has a clearer picture of the whole situation than me, I don't know. In an ideal world we'd need to do that research first, to know the as-is and wanted to-be situation and discuss a proper transition. But I guess there is the pressure to release the new resolver so 🤷 ...

Now, if anyone wants more of my instinct, I'd say @uranusjr's proposal at #5780 (comment) looks promising (basically reinstall only if url changed compared to direct_url.json), but may require a new --force-reinstall-but-not-dependencies option to resolve this issue that tox raised without performance impact.

@brainwane brainwane added this to the 20.3 milestone Nov 20, 2020
@gregcouch
Copy link

gregcouch commented Nov 21, 2020 via email

@pfmoore
Copy link
Member

pfmoore commented Nov 21, 2020

Thanks for the data point @gregcouch - why is --force-reinstall (possibly with --no-deps) not sufficient for you?

@pradyunsg
Copy link
Member

A few notes, after good night's sleep and re-reading this issue:

That's why my instinct tells me https:// vs file:// should not trigger different behaviors.

That matches my instinct as well. :)

The distinction to me is that #5780 is asking for a behavior change in how we handle URLs in general, toward making it more consistent. This issue is about making the new resolver behave like the old resolver w.r.t. local directories and files.

In an ideal world we'd need to do that research first, to know the as-is and wanted to-be situation and discuss a proper transition.

YES! Exactly. We could've used this resolver change to also make progress on #5780, but we just don't have the time/dev resource available right now to properly consider all-the-things and make those changes as we probably want to.

I think #9147 gets us mostly where we might want to be, in terms of this issue. The behaviors for local files and directories would be unchanged between the old vs new resolver (as far as I can tell), including the behaviors for wheels, albeit with more informative messages.

I honestly don't know.

I don't either! That's the whole reason behind me falling back to "let's just do what the old resolver does and revisit this when we have more time to think through this, instead of unintentionally breaking certain workflows". The only thing different is that there's a new deprecation, toward changing the sdist reinstall behavior. :)

I guess there is the pressure to release the new resolver

There is, definitely. I start a non-pip full-time role on Monday, which will affect my availability. Plus, we've already had fairly substantial delays for various other reasons, so I'm not too keen on waiting until after we have an extended discussion on this to move forward with the release. :)

Now, if anyone wants more of my instinct, I'd say @uranusjr's proposal at #5780 (comment)

And, I agree with this. I tried implementing this, but it quickly got a little more messy and I realized it was easier to replicate the old resolver's behavior than to deviate from it.


My group needs the reinstall behavior for wheels.

Could you elaborate a little further on this? Based on my testing here, even the old resolver did not reinstall wheels if the versions match. Either you are asking for behavior that wasn't in the old resolver or I am missing something. :)

pip install --no-deps --use-deprecated=legacy-resolver $wheel
Requirement already satisfied: furo==2020.11.19b18 from file:///Users/pradyunsg/Projects/furo/dist/furo-2020.11.19b18-py3-none-any.whl in /Users/pradyunsg/.virtualenvs/pip/lib/python3.8/site-packages (2020.11.19b18)

vs

pip install --no-deps ../furo/dist/furo-2020.11.19b18-py3-none-any.whl
Processing /Users/pradyunsg/Projects/furo/dist/furo-2020.11.19b18-py3-none-any.whl
furo is already installed with the same version as the provided wheel. Use --force-reinstall to force an installation of the wheel.

@pradyunsg
Copy link
Member

puts on release manager hat

I'd like to get #9147 merged to unblock us on one of the two major blockers for the main 20.3 release. If there are any concerns around how the new resolver's behavior post #9147 deviates from the legacy resolver, please flag them there.

@sbidoul
Copy link
Member

sbidoul commented Nov 21, 2020

I just did a quick test with 20.2.4:

pip install ./packaging-20.4-py2.py3-none-any.whl does not reinstall
pip install "packaging @ file://$PWD/packaging-20.4.tar.gz" does not reinstall
pip install ./packaging-20.4.tar.gz reinstalls

#9147 does this:

pip install ./packaging-20.4-py2.py3-none-any.whl does not reinstall
pip install "packaging @ file://$PWD/packaging-20.4.tar.gz" reinstalls
pip install ./packaging-20.4.tar.gz reinstalls

So that does not quite reproduce what the old resolver did.

It looks like the behaviour of 20.2.4 is kind of a side effect of pip needing to build to find the package name.

@pradyunsg
Copy link
Member

We don't need bug for bug compatibility IMO - just something that's not obviously borked. 😅

I'm inclined to say that the behaviour of PEP 508 vs non-PEP 508 situations to be a bit of a bug-that-is-difficult-to-recreate, and making it consistent on that front right now is A-OK.

@sbidoul
Copy link
Member

sbidoul commented Nov 22, 2020

bug for bug for compatibility

Sure, but what is the bug in the first place ? That's not obvious to me. Perhaps the bug is that it did reinstall at all.

@gregcouch
Copy link

gregcouch commented Nov 23, 2020 via email

@bluenote10
Copy link

bluenote10 commented Nov 28, 2020

My group needs the reinstall behavior for wheels.

Same in my team, and why I have raised #9116.

Thanks for the data point @gregcouch - why is --force-reinstall (possibly with --no-deps) not sufficient for you?

For our developers the workflow is to re-run pip install -e whenever they have made "install related" changes to a local package, which in our case typically include:

  • modifications to entry points
  • changes to Cython modules which require a re-compile

Think of it: The only reason to re-run pip install -e on a local package is to pick up the change. I'm currently teaching our team to always use pip install --force -e instead. If a command line option like --force is always required, the default behavior clearly doesn't seem very useful.

EDIT: According to @pfmoore's comment I may have misunderstood the issue description.

@pradyunsg pradyunsg removed this from the 20.3 milestone Nov 28, 2020
@chrisg123
Copy link

chrisg123 commented Dec 4, 2020

I was pointed here by the deprication warning:
DEPRECATION: Source distribution is being reinstalled despite an installed package having the same name and version as the installed package. pip 21.1 will remove support for this functionality. A possible replacement is use --force-reinstall. You can find discussion regarding this at https://github.com/pypa/pip/issues/8711.

Currently on pip 20.3.1 and running pip install <pkg_from_local_file> does exactly what I need.
It will always reinstall the specified package and only install dependencies if required.

How might I achieve the same behavior once on pip 21.1 ? the --force-reinstall seems to force dependencies to reinstall as well.

I tried reading through the comments but it is not very clear. The deprecation warning however seems very clear, what I need will stop working with pip 21.1 so It would be nice to get ahead of this.

@tgs
Copy link
Contributor

tgs commented Dec 4, 2020

You asked for feedback, so I'll tell you what I think...

I normally expect pip install /specific/file to either install that file or fail with an error. On the other hand, I expect pip install package-name not to do anything if the package is already there at the appropriate version. I can't think of a time when I'd say pip install /specific/file and not want pip to install that file...

Having the behavior change depending on whether a file is a wheel or an sdist is surprising --- to me, file name vs package name is a way more salient distinction. But I've never encountered that quirk before, so maybe it doesn't matter.

I guess it's fine if my mental model was wrong... but if the install /specific/sdist behavior is changing, then I agree with others that we still need an easy command for "definitely try to install this file, but only reinstall its dependencies if their versions have changed"

@rbalik
Copy link

rbalik commented Dec 9, 2020

My company relies on things NOT reinstalling as an optimization in our production build pipeline and it seems that this behavior is different now, I assume because of this issue.
The specific case for us is:

  1. Install a specific internal package by cloning to a directory and then doing pip install <directory>
  2. Later on, we run pip install -r requirements.txt. requirements.txt contains the library installed in step 1 (via a URL to a private git repo), but this is skipped because it was already installed.

Now, pip seems to want to install the library again in step 2. Is there anyway to avoid this behavior now?

@uranusjr uranusjr changed the title local development and pip's 2020 resolver Should a local file always be re-installed on 'pip install' (with or without '--upgrade')? Mar 7, 2021
@sbidoul sbidoul mentioned this issue Apr 24, 2021
@sbidoul sbidoul added this to the 21.2 milestone Apr 24, 2021
@sbidoul sbidoul added the type: deprecation Related to deprecation / removal. label Apr 25, 2021
@uranusjr
Copy link
Member

Merging this into #5780 (comment)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
C: direct url Direct URL references (PEP 440, PEP 508, PEP 610) state: needs discussion This needs some more discussion type: deprecation Related to deprecation / removal.
Projects
None yet
Development

No branches or pull requests