Allow to skip downloading manylinux wheels #3689

Open
sashkab opened this issue May 16, 2016 · 71 comments
Labels
  • C: wheel (The wheel format and 'pip wheel' command)
  • state: awaiting PR (Feature discussed, PR is needed)
  • type: docs (Documentation related)

Comments

@sashkab
Contributor

sashkab commented May 16, 2016

  • Pip version: 8.1.{0,1,2}
  • Python version: 2.7
  • Operating System: CentOS 6

Description:

When pip 8.1 introduced support for manylinux1 wheels, a few issues started to show up when attempting to build wheels. We use a custom Python installation, installed in a location other than the system Python's. When we upgrade our package requirements, we rebuild the wheels ourselves using the command below:

$ pip wheel -w /path/to/wheel -f /path/to/wheel --use-wheel -r requirements.txt

Before pip 8.1, this command did what I expected: it built wheels for new packages in the requirements.txt file. Since pip 8.1, it just downloads manylinux wheels, despite the fact that /path/to/wheel already has a wheel for the requirement.

Let's take an example. One of the packages we use is numpy. The requirement string looks like this: numpy==1.10.4. This ensures we use only this package version. What pip pre-8.1 did: it detected that a wheel for numpy 1.10.4 was already built and did nothing else. What pip 8.1.x does: it downloads the manylinux1 wheel, despite the fact that I already have a wheel for 1.10.4 and wasn't updating the numpy wheel.

You might suggest adding --no-binary numpy, but that won't solve my problem either -- I don't want to rebuild the numpy package every time I build wheels, and I don't want to hand-pick only the updated packages to build wheels for. I like what I had before: -r requirements.txt, and it did the job.

So what I'm asking for is one of two things:

  1. If a wheel is already built, just skip it; don't attempt to download a manylinux wheel.
  2. An option to disable downloading manylinux wheels.
@dstufft
Member

dstufft commented May 16, 2016

You can drop a _manylinux.py file in site-packages or in the standard library or wherever that will make it importable with contents like:

manylinux1_compatible = False

Does that satisfy your use case?

@sashkab
Contributor Author

sashkab commented May 16, 2016

Thank you for the quick response. It might -- I need to test this.

But I'd still prefer some kind of command-line option (i.e. --no-manylinux1) for pip.

@Dr-Bean

Dr-Bean commented May 18, 2016

Got bitten by this too. Creating a _manylinux.py file works, but imo, a command-line option or even an environment variable to exclude manylinux wheels would be a lot cleaner.

@tlandschoff-scale

We got bitten by this too. Builds that used to work (on CentOS 5, CentOS 6) still seemed to work but the final PyInstaller build does not work.

@sashkab
Contributor Author

sashkab commented Aug 2, 2016

@dstufft -- I'm wondering if you decided on implementing an option to skip manylinux1 wheel downloads? For some reason, I keep getting bitten by this over and over. Keeping a _manylinux.py file in site-packages is nice, but I can't always remember to add it... :(

@dholth
Member

dholth commented Aug 2, 2016

There is an issue suggesting that the non-manylinux1 tag should have precedence over the manylinux1 tag. Would that solve your issue?

@asottile
Contributor

asottile commented Aug 2, 2016

No it will not. I'd like a flag to ignore manylinux entirely (and download source distribution or other wheels).

@dholth
Member

dholth commented Aug 2, 2016

Have you tried publishing a package that provides _manylinux.py and installing that into your virtual environments as a matter of course?

@asottile
Contributor

asottile commented Aug 2, 2016

Yes, but I consider it a dirty hack (why should installing a package have a side-effect of changing how pip functions?).

It also doesn't work in all cases, for example with a system pip where I do not control dist-packages.

@pfmoore
Member

pfmoore commented Aug 2, 2016

If anyone is interested in writing a PR for this, that would probably help move it forward (a command line option would make the most sense, as that would automatically support setting it in the ini file or via an environment variable).

Otherwise, supplying _manylinux.py is probably the best option for now. (You could set PYTHONPATH to a directory of your choosing, and add _manylinux.py there, that would make it visible in all environments)
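
The PYTHONPATH variant can be sketched as follows, assuming a shared directory of your choosing (a temporary directory stands in for it here). The child interpreter is only used to show that the module becomes importable through the environment variable:

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical shared directory holding the opt-out module.
shared_dir = Path(tempfile.mkdtemp())
(shared_dir / "_manylinux.py").write_text("manylinux1_compatible = False\n")

# Prepend the directory to PYTHONPATH for the child process only.
env = dict(os.environ)
env["PYTHONPATH"] = os.pathsep.join(
    p for p in (str(shared_dir), env.get("PYTHONPATH", "")) if p
)

# Any interpreter started with this environment can import _manylinux.
result = subprocess.run(
    [sys.executable, "-c",
     "import _manylinux; print(_manylinux.manylinux1_compatible)"],
    env=env,
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())
```

Exporting the same PYTHONPATH value in a shell profile would have the equivalent effect for every environment started from that shell.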

@asottile
Contributor

asottile commented Aug 3, 2016

I started working on a patch, does this seem sane before I start figuring out how to test this?

https://github.com/pypa/pip/compare/master...asottile:no_manylinux?expand=1

For example:

Default

$ pip download libsass --dest foo
Collecting libsass
  Using cached libsass-0.11.1-cp27-cp27mu-manylinux1_x86_64.whl
  Saved ./foo/libsass-0.11.1-cp27-cp27mu-manylinux1_x86_64.whl
Collecting six (from libsass)
  Using cached six-1.10.0-py2.py3-none-any.whl
  Saved ./foo/six-1.10.0-py2.py3-none-any.whl
Successfully downloaded libsass six
$ ls foo/
libsass-0.11.1-cp27-cp27mu-manylinux1_x86_64.whl
six-1.10.0-py2.py3-none-any.whl

With --no-manylinux

$ pip download libsass --dest foo --no-manylinux
Collecting libsass
  Using cached libsass-0.11.1.tar.gz
  Saved ./foo/libsass-0.11.1.tar.gz
Collecting six (from libsass)
  Using cached six-1.10.0-py2.py3-none-any.whl
  Saved ./foo/six-1.10.0-py2.py3-none-any.whl
Successfully downloaded libsass six
$ ls foo/
libsass-0.11.1.tar.gz  six-1.10.0-py2.py3-none-any.whl

@sashkab
Contributor Author

sashkab commented Aug 4, 2016

@asottile -- why don't you submit pull request so somebody could review it and provide comments?

@asottile
Contributor

asottile commented Aug 4, 2016

Sure, felt like I should get a first round of feedback on the initial approach since it is untested currently but I can do that

@asottile
Contributor

asottile commented Aug 4, 2016

#3892

@jayvdb

jayvdb commented Aug 4, 2016

Could this be a problem with the new manylinux download being run unnecessarily?
If you have an existing wheel locally, pip doesn't need to fetch a new wheel, i.e. by default it shouldn't download a manylinux wheel if there is already a non-manylinux wheel available locally.

Then if someone really wants pip to not use a locally available wheel, they shouldn't put it in a place where pip is looking for local wheels.

@sashkab
Contributor Author

sashkab commented Aug 4, 2016

Could this be a problem with the new manylinux download being run unnecessarily?

This too. But most importantly, I want an option in pip which will disable manylinux completely. Currently, I need to check whether a manylinux wheel somehow got downloaded and kill it in the wheelhouse. This is very annoying at times.
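
That manual wheelhouse check can be sketched in a few lines. This assumes the standard wheel filename layout, in which the platform tag is the last dash-separated field; the function names are made up for illustration:

```python
from pathlib import Path


def platform_tags(wheel_filename):
    """Return the platform tags encoded in a wheel filename."""
    stem = wheel_filename[: -len(".whl")]
    # The platform tag is the last dash-separated field; several tags may
    # be joined with dots (a "compressed tag set").
    return stem.rsplit("-", 1)[-1].split(".")


def is_manylinux_wheel(wheel_filename):
    """True if the file is a wheel carrying any manylinux platform tag."""
    return wheel_filename.endswith(".whl") and any(
        tag.startswith("manylinux") for tag in platform_tags(wheel_filename)
    )


def find_manylinux_wheels(wheelhouse):
    """List manylinux wheels sitting in a wheelhouse directory."""
    return sorted(
        path for path in Path(wheelhouse).glob("*.whl")
        if is_manylinux_wheel(path.name)
    )
```

Calling find_manylinux_wheels("/path/to/wheel") returns the offending files so they can be reviewed or deleted; it does nothing to stop pip from downloading them in the first place.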

@asottile
Contributor

asottile commented Aug 4, 2016

Our main usecase is we do a one-time download / build of wheels to put in our internal pypi server and manylinux wheels are very much incompatible with our security requirements.

@jayvdb

jayvdb commented Aug 4, 2016

Any solution which focuses on manylinux will be linux specific.

manylinux wheels are very much incompatible with our security requirements.

If I understand correctly, you want pip to not download a binary from a foreign repo, but you are happy with a binary being built and used locally.

But there could be a Windows shop which has the same security requirements, and a manylinux solution won't work for them.

In which case you want to be able to disable binaries from foreign repos, and then the Windows shop will also be able to use the solution.

@asottile
Contributor

asottile commented Aug 4, 2016

It's not that it's binary from a foreign repo, it's that shared object files of libraries (that often have security fixes such as libxml, libssl, etc.) are straight vendored into the wheel.

The Windows shop is probably already OK with --no-binary ':all:', which will avoid the win32 / win_amd64 wheels?

@jayvdb

jayvdb commented Aug 4, 2016

it's that shared object files of libraries (that often have security fixes such as libxml, libssl, etc.) are straight vendored into the wheel.

Can you give an example?

Maybe there is a generic way to improve pip such that it excludes/rejects those prebuilt wheels, only when they include undesirable contents, which could also occur on Windows.

@sashkab
Contributor Author

sashkab commented Aug 4, 2016

@jayvdb an example is numpy -- it comes as a manylinux wheel with bundled libraries which aren't compatible with the system, and it caused me a couple of hours of headache when I accidentally downloaded the manylinux wheel and used it to install numpy. So I'd rather have a --no-manylinux flag for pip {download,install,wheel} than waste time later figuring out why something suddenly doesn't work the way it should.

@jayvdb

jayvdb commented Aug 5, 2016

numpy/numpy#7570 appears to be the only open issue related to manylinux. It does confirm they are shipping .so's and causing many problems in the process. :/

@xavfernandez
Member

I'd rather not have a "manylinux" specific option.

Maybe a more general option --only-pure-python or something akin to --no-binary or --only-binary...

@asottile
Contributor

asottile commented Aug 6, 2016

Manylinux is already a special case in pip. I also still want to be able to download normal binary wheels (such as from an internal pypi server).

@pfmoore
Member

pfmoore commented Aug 6, 2016

Manylinux is already a special case in pip.

Is manylinux not simply a specific compatibility tag? (I'm not that familiar with how manylinux is implemented). I would assume that Linux platforms state that they support a set of compatibility tags that includes manylinux, but that they prefer platform-specific binaries over manylinux. In which case, the more general option would be to have something that allows users to remove tags from the list of supported compatibility tags.

In any case, I agree with @xavfernandez that we should prefer general solutions over special cases. The compatibility tag mechanism handles this (or should, it's what it was designed for) so I'd prefer manylinux to work within that framework (and then this issue becomes "we need a way to override the default platform compatibility list").
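
To make the "override the default platform compatibility list" idea concrete, here is a rough sketch. The tag list is a hand-written, simplified stand-in and the helper is hypothetical; it does not reflect how pip actually builds or stores its supported tags:

```python
# Hypothetical, simplified list of (interpreter, abi, platform) tags in
# priority order; a real list would be generated for the running platform.
supported = [
    ("cp27", "cp27mu", "manylinux1_x86_64"),
    ("cp27", "cp27mu", "linux_x86_64"),
    ("cp27", "none", "any"),
    ("py2", "none", "any"),
]


def without_platform_prefix(tags, prefix):
    """Drop every tag whose platform part starts with `prefix`, keeping order."""
    return [tag for tag in tags if not tag[2].startswith(prefix)]


filtered = without_platform_prefix(supported, "manylinux")
print(filtered)
```

With manylinux entries removed from the list, a manylinux wheel would simply never match, while platform-specific and pure-Python wheels would still be accepted.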

@asottile
Contributor

asottile commented Aug 6, 2016

Here's where pip special cases manylinux:

arches = [arch.replace('linux', 'manylinux1'), arch]
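
For illustration, this is what that expression evaluates to for a typical arch string. The wrapper function is only there to make the snippet runnable on its own:

```python
def linux_arches(arch):
    """Mirror the quoted expression: manylinux1 variant first, then the plain arch."""
    return [arch.replace('linux', 'manylinux1'), arch]


arches = linux_arches('linux_x86_64')
print(arches)  # prints ['manylinux1_x86_64', 'linux_x86_64']
```

Both entries end up in the list, with the manylinux1 variant ahead of the plain linux arch.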

@pfmoore
Member

pfmoore commented Aug 6, 2016

OK. I wonder why it replaces linux with manylinux, rather than just adding a lower-priority manylinux.

@xavfernandez
Member

@pfmoore it adds a manylinux flavor to the arch in addition to the supported vanilla arch.

We could maybe piggyback on #3760 and allow --platform option for pip install.

@asottile
Contributor

The biggest reason being it's not possible to avoid them in a single install command, the other reason is it's yet another thing to update when new manylinux standards appear (apparently it's been years since 2010 but hadn't been updated until today for instance). The other is it's not really a system per se, it's more akin to the --no-binary case (I could totally imagine a --no-binary :manylinux: or something which exactly solves this issue)

@pradyunsg
Member

apparently it's been years since 2010

Oh, the number reflects the date of the oldest systems that are compatible with that version. I don't have the dates for manylinux2010 offhand, but manylinux2014 became an approved PEP last month. 🙃

@pradyunsg pradyunsg added state: awaiting PR Feature discussed, PR is needed and removed S: awaiting response Waiting for a response/more information labels Aug 12, 2019
@pradyunsg
Member

pradyunsg commented Aug 12, 2019

Alrighty. Let's add an option to pip to handle this on a per-install basis. I honestly don't think this is high priority for me but we'll be happy to accept a PR for this (subject to the regular PR reviews).

@pfmoore
Member

pfmoore commented Aug 12, 2019

I'm still not convinced we need a special case option in pip. We already have a plethora of options for special-case tweaking of what gets installed, and the maintenance overhead is non-trivial. While I don't particularly think that a "no manylinux" option will add significantly to that burden, it is nonetheless another step down that slope.

I'm not going to block a PR for this, but I want to strongly advise caution when considering it. How many users will it benefit? How often will it be the only possible solution for such users? How does that measure against the added technical debt (and user confusion cost) that this incurs?

@bsolomon1124

I know this is a long conversation already, but one thing that has not been pointed out is that, from a security perspective, this seems not to be a manylinux-specific thing at all.

As mentioned above:

It's not that it's binary from a foreign repo, it's that shared object files of libraries (that often have security fixes such as libxml, libssl, etc.) are straight vendored into the wheel.

This vendoring can occur not just via auditwheel (Linux) but also via delocate on macOS. The concept is the same; a third-party, not-whitelisted library is being bundled into the wheel. So a --no-manylinux tag is probably not holistic enough in that sense. If you wanted something besides --no-binary, it would probably need to account for the more intricate logic of disallowing wheels that contain bundled libraries specifically.
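
A heuristic check for bundled libraries could look like the sketch below. It assumes the directory naming conventions that auditwheel (a directory ending in .libs) and delocate (a .dylibs directory) are commonly seen to use; those conventions and the function name are assumptions here, and a wheel could vendor libraries in other ways that this would miss:

```python
import io
import zipfile


def bundled_library_dirs(wheel_file):
    """Return directory names in a wheel that look like vendored-library dirs."""
    found = set()
    with zipfile.ZipFile(wheel_file) as archive:
        for name in archive.namelist():
            # Inspect every directory component of each archive member.
            for part in name.split("/")[:-1]:
                if part.endswith(".libs") or part == ".dylibs":
                    found.add(part)
    return sorted(found)


# Build a tiny in-memory stand-in for a repaired wheel to show the check.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("example/__init__.py", "")
    archive.writestr("example.libs/libfoo-1a2b3c.so.1", b"")
buffer.seek(0)
print(bundled_library_dirs(buffer))  # prints ['example.libs']
```

A check like this works on any wheel regardless of platform tag, which is the point being made: the concern is the bundled content, not the manylinux label.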

@pradyunsg pradyunsg added resolution: no action When the resolution is to not do anything and removed state: awaiting PR Feature discussed, PR is needed labels May 26, 2020
@pradyunsg
Member

With PEP 600 and pip moving to use packaging.tags (a common shared implementation for generating compatibility tags), pip's codebase no longer directly controls or generates the compatibility tags. Further, with PEP 600, there's now no need for any updates to no-manylinux, once support for disabling PEP 600-style manylinux wheels is added to it.

As it is already possible to disable this behavior, via pip install no-manylinux from PyPI, and given that the overhead of keeping that package functioning is very low now, I no longer think that we should do this.

@asottile
Contributor

the no-manylinux package still feels like a huge hack :S -- but I did update it a few months ago for PEP 600 asottile-archive/no-manylinux@5e5dea7

@tobyp

tobyp commented Oct 2, 2020

For what it's worth, there's a usecase here that isn't easily covered by the no-manylinux package approach: If the system on which the wheels are built/collected supports manylinux, but they're being collected to build a container/image for a system that won't.

For example, packaging up a package and its dependencies with pip wheel -w dist mypackage, and then a Dockerfile like this:

FROM python:3.8-alpine

COPY dist /wheels
RUN pip install --no-index --find-links /wheels mypackage; rm -rf /wheels

Alpine does not support manylinux packages (docker build for the above will overlook any manylinux dependencies and say ERROR: Could not find a version that satisfies the requirement at the RUN step), but the system on which pip wheel runs does, and should not generally avoid them.

Essentially, this is similar to cross-compiling, so would it be acceptable to allow constraining which packages pip is allowed to use in a wheel run? This constraint set could be "exported" from another installation's pip.pep425tags.get_supported. If I'm not mistaken, this might actually make it a more generic version of --no-binary, since that just maps to only allowing any?

@asottile
Contributor

asottile commented Oct 2, 2020

@tobyp I believe that's a related but entirely separate problem -- and in general building wheels on a disparate platform is not going to work. for example, if you produce a wheel which links against libc (even if it isn't downloading this wheel which is what this issue is about)

@tobyp

tobyp commented Oct 6, 2020

@asottile Thanks for the reply, and explaining the difference between this issue and my use case! I've managed to solve my problem using multi-stage docker builds that produce dist already inside a container, which sidesteps the libc-related problems you mention.

For what it's worth, if it's only about preventing the installation of manylinux packages, I can't think of any case that wouldn't be covered by the no-manylinux package either.

@ccoulombe

With the latest version of pip (22.2), how would one now disable all manylinux wheels?

@asottile
Contributor

@ccoulombe same as above, install no-manylinux first before installing

@ccoulombe

ccoulombe commented Jul 21, 2022

Thanks @asottile
I asked since this seems to not work with PEP 600

(492) ~ $ pip install -U pip
(492) ~ $ pip --version
pip 22.2 from /home/coulombc/.envs/492/lib/python3.8/site-packages/pip (python 3.8)
(492) ~ $ pip install no-manylinux
(492) ~ $ pip install orjson-3.7.8-cp38-cp38-manylinux_2_28_x86_64.whl 
Processing ./orjson-3.7.8-cp38-cp38-manylinux_2_28_x86_64.whl
Installing collected packages: orjson
Successfully installed orjson-3.7.8
(492) ~ $ 

@asottile
Contributor

works fine for me:

$ pip install orjson-3.7.8-cp38-cp38-manylinux_2_28_x86_64.whl 
ERROR: orjson-3.7.8-cp38-cp38-manylinux_2_28_x86_64.whl is not a supported wheel on this platform.

@ccoulombe

Right! On my personal computer, it works as expected.
But on systems with a _manylinux.py :

manylinux1_compatible = False
manylinux2010_compatible = False
manylinux2014_compatible = False
manylinux_2_24_compatible = False
manylinux_2_28_compatible = False
manylinux_2_31_compatible = False

I'm still able to install the orjson manylinux wheel, with and without the no-manylinux package.

If I add the manylinux_compatible function to the _manylinux.py, it works as expected without the no-manylinux package.

The system _manylinux.py had priority over the _manylinux.py installed by the no-manylinux package.
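
When two copies of the module exist, a quick way to see which one Python would actually import is to ask the import system for its location. A small sketch using only the standard library (the function name is made up):

```python
import importlib.util


def manylinux_module_origin():
    """Return the file path of the _manylinux module Python would import, or None."""
    spec = importlib.util.find_spec("_manylinux")
    return spec.origin if spec is not None else None


origin = manylinux_module_origin()
print(origin)
```

Run this with the same interpreter that runs pip; the path printed is the copy that takes effect, and None means no _manylinux module is importable at all.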

Thanks @asottile!
Solution: add the manylinux_compatible function to the system _manylinux.py
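
A sketch of what such a _manylinux.py could contain, based on my reading of PEP 600: the per-tag attributes only exist for manylinux1, manylinux2010 and manylinux2014, while the newer manylinux_X_Y tags are governed by a manylinux_compatible function. Attributes such as manylinux_2_28_compatible are, as far as I can tell, not consulted at all. Treat the details as an assumption to verify against the PEP:

```python
# Contents for a _manylinux.py that opts out of all manylinux wheels.

# Legacy attributes, consulted for the manylinux1/2010/2014 tags when the
# function below is absent.
manylinux1_compatible = False
manylinux2010_compatible = False
manylinux2014_compatible = False


def manylinux_compatible(tag_major, tag_minor, tag_arch):
    """PEP 600 hook, called with the glibc version and architecture of a tag.

    Returning False rejects the tag; returning None would defer to the
    installer's default logic.
    """
    return False
```

Because the function returns False unconditionally, it rejects every manylinux tag, including future manylinux_X_Y versions, without the file needing further updates.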

@asottile
Contributor

yeah your _manylinux.py file is wrong

@uranusjr
Member

What is the action item for pip here? If there’s not one, should we close this issue?

@ccoulombe

From a user's perspective: document how to achieve this? That is, how to ignore manylinux wheels by using a system/site _manylinux.py.

@pradyunsg pradyunsg added type: docs Documentation related state: awaiting PR Feature discussed, PR is needed and removed type: enhancement Improvements to functionality resolution: no action When the resolution is to not do anything labels Mar 14, 2023
@pradyunsg
Member

Retagged to reflect that documentation for this is the right thing to do here.
