Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --exclude option to auditwheel repair #368

Merged
merged 5 commits into from
Oct 22, 2022

Conversation

martinRenou
Copy link

@martinRenou martinRenou commented Feb 5, 2022

This is a follow-up of #310 by @rossant, only keeping the exclude option as suggested by @mayeut

Closes #76
Closes #241
Closes #310
Fixes #391

Original PR description:


This is a quick fix to solve an issue I have with my project and for which auditwheel seemed to help. I don't know if this approach is suitable to other users of auditwheel.

I develop a C graphics library (datoviz) that depends on Vulkan and comes with Cython bindings. datoviz/datoviz#13 for Linux (at least Ubuntu 20.04, that would be a start).

  1. As a first step, I compile the C project and get a libdatoviz.so shared library.
  2. Then, I build the Cython library, which has a dynamic dependency to libdatoviz.
  3. I'd like to bundle both the Cython extension module, and libdatoviz, in the same wheel.
  4. I tried to use auditwheel repair. It included dozens of other dependencies in the wheel, including libvulkan and many graphics-related dependent libraries. The compiled wheel installs properly, but it doesn't work properly (there are issues related to windowing, GPU access, etc). Related: https://github.com/pypa/auditwheel/issues/241[](https://github.com/rossant)
  5. I made some changes to auditwheel (this pull request) to exclude libvulkan and a few other libraries. Some issues were fixed, but not others.
  6. As another approach, I tried to bundle just libdatoviz, and no other library in the wheel. That seems to work, at least on my computer. Chances are that the wheel may not work on other Linux-based operating systems though.
    Any help would be appreciated, regarding either this pull request, and/or a way to solve the issue I'm facing. Thanks!

@martinRenou martinRenou changed the title Add --exclude option Add --exclude option to auditwheel repair Feb 5, 2022
@codecov
Copy link

codecov bot commented Feb 5, 2022

Codecov Report

Base: 92.42% // Head: 92.45% // Increases project coverage by +0.02% 🎉

Coverage data is based on head (83ae412) compared to base (73c9f17).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #368      +/-   ##
==========================================
+ Coverage   92.42%   92.45%   +0.02%     
==========================================
  Files          23       23              
  Lines        1268     1272       +4     
  Branches      311      312       +1     
==========================================
+ Hits         1172     1176       +4     
  Misses         55       55              
  Partials       41       41              
Impacted Files Coverage Δ
src/auditwheel/main_repair.py 90.76% <100.00%> (+0.14%) ⬆️
src/auditwheel/repair.py 86.61% <100.00%> (+0.32%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

src/auditwheel/repair.py Outdated Show resolved Hide resolved
src/auditwheel/repair.py Outdated Show resolved Hide resolved
Copy link
Member

@mayeut mayeut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution,
I left a couple remarks inlined.
The PR is failing CI:

  • we do not have pre-commit.ci setup yet so please make sure to run nox -s lint in order to check/repair for linter issues. if you don't know it, nox is a command-line tool that automates testing in multiple Python environments.
  • codecov is complaining of a drop in coverage because there's no test associated with this new feature. Can you please add some tests for this ?

@martinRenou
Copy link
Author

Thanks for reviewing! I'll try to find the time to add tests and fix linting

src/auditwheel/repair.py Outdated Show resolved Hide resolved
@martinRenou martinRenou force-pushed the exclude_so_files branch 2 times, most recently from 27939cf to 82ef32c Compare March 8, 2022 14:11
@leofang
Copy link

leofang commented Mar 25, 2022

I think this is a nice addition. Any chance we can get this in sooner? 🙂

auvipy
auvipy previously requested changes Apr 4, 2022
Copy link
Contributor

@auvipy auvipy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please increase the test coverage

@lkollar lkollar force-pushed the exclude_so_files branch 2 times, most recently from 26c1bb8 to ebd5098 Compare July 18, 2022 14:13
@lkollar
Copy link
Contributor

lkollar commented Jul 18, 2022

@martinRenou I made a few changes to the implementation, mainly to simplify it and added tests for it. I hope you don't mind that I pushed the changes directly into your branch.

rossant and others added 2 commits July 18, 2022 16:02
This allows excluding libraries from the resulting wheels, like OpenGL
or Vulkan, which are provided by the OS.
We have fairly complex code in a test to build the numpy wheels. We can
extract this and reuse it in other tests in the future which need a
cleanly build wheel before testing `auditwheel repair` functionality.
Build numpy from source and when repairing, exclude the gfortran
library. This is only tested in a single manylinux image to avoid
increasing the test time too much, as the functionality is not dependent
on the platform.
@martinRenou
Copy link
Author

I don't mind at all. Thanks a lot!

@lkollar
Copy link
Contributor

lkollar commented Aug 16, 2022

I haven't merged this yet, as I'm still unsure if it is a good idea to exclude arbitrary libraries. On one hand, it solves a real issue, and I don't see how else we could achieve this, but it also ends up violating the manylinux spec. There is a risk that this will cause confusion and bad user experience, and it makes it more likely that users will run into ABI incompatibilities due to version mismatches between the extension and its excluded dependencies. Another concern is that if PyPI wanted to block uploads of non-conforming wheels, this feature would make it more difficult to implement it (see pypi/warehouse#5420).

I wonder if this needs a larger discussion with the PyPA community, since the scope is bigger here than just auditwheel. If we want to properly support wheels which require some special external libraries, that could require changes in tooling in other areas as well. I don't have the bandwidth to take this on, but if someone wanted to raise this on the Packaging Discourse, that would be a good start.

Any thoughts @mayeut?

@mattip
Copy link

mattip commented Aug 17, 2022

I think this would allow abusing the purpose of auditwheel. I commented on the original issue about not pulling libOpenGL shared objects into the wheels.

@jakirkham
Copy link
Contributor

I wonder if this needs a larger discussion with the PyPA community, since the scope is bigger here than just auditwheel. If we want to properly support wheels which require some special external libraries, that could require changes in tooling in other areas as well. I don't have the bandwidth to take this on, but if someone wanted to raise this on the Packaging Discourse, that would be a good start.

There was PEP 668, which AIUI was also trying to help this use case of having some libraries externally managed. This is the result of a few discussions in different threads over the years (including past PyCons):

@mayeut
Copy link
Member

mayeut commented Oct 8, 2022

I commented on the original issue about not pulling libOpenGL shared objects into the wheels.

This would require a new ABI compatibility analysis to be done to ensure whitelisting is allowed per pep-600

I think this would allow abusing the purpose of auditwheel.

Yes, it would allow abusing the purpose but allowing package maintainers to use this (responsibly) is a way not to have auditwheel maintainers on the critical path of many projects. As can be seen by the activity in the repo, I do not have that much time to spend on this project, I'm not sure about @lkollar, but I guess that's about the same.

see also my other comment in #310 (comment)

There are already many packages on PyPI abusing PEP600, this is no reason to do so but, IMHO, having an escape hatch is always a good thing to have. It's probably better to have this escape hatch and have the rest of auditwheel run its process than to have a bunch of hand-tagged manylinux1 wheels on PyPI.

One example of abuse (or let's say at least gray area) are GPU dependent packages. Those raised some discussion but I'm not sure there's a real conclusion yet (& the discussion starter was package size rather than pep600 compliance).
For example, one of torch wheels is torch-1.12.1-cp310-cp310-manylinux1_x86_64.whl. At least one of the shared object in that wheel links against libcuda.so.1 which is part of the GPU driver and should not be grafted (nor whitelisted per strict PEP600) thus can't be repaired by auditwheel today (& "this led" to an invalid manylinux1 tag - might be another reason entirely but sure would have been a blocker).

There is a risk that this will cause confusion and bad user experience, and it makes it more likely that users will run into ABI incompatibilities due to version mismatches between the extension and its excluded dependencies.

Agreed but my comment above applies: I think the pros outweigh the cons.

Another concern is that if PyPI wanted to block uploads of non-conforming wheels, this feature would make it more difficult to implement it (see pypi/warehouse#5420).

I think that if PyPI is to implement that feature, they should not block on missing shared objects. They can't know for sure they're not compliant unless doing a manual review. We can't even know for sure that a manylinux1 tagged wheel that embeds some manylinux2014+ only symbol versions is not in fact manylinux1 compliant although in 99.9% of cases, we'd be right to assume it should be tagged manylinux2014.

I wonder if this needs a larger discussion with the PyPA community, since the scope is bigger here than just auditwheel. If we want to properly support wheels which require some special external libraries, that could require changes in tooling in other areas as well. I don't have the bandwidth to take this on, but if someone wanted to raise this on the Packaging Discourse, that would be a good start.

While I personally think this should be merged, some discussions could be raised but I don't have the bandwidth to take this on either.

EDIT: @rgommers is probably right about a large community-wide discussion 2 comments below this one.

@leofang
Copy link

leofang commented Oct 10, 2022

One example of abuse (or let's say at least gray area) are GPU dependent packages. Those raised some discussion but I'm not sure there's a real conclusion yet (& the discussion starter was package size rather than pep600 compliance).

Speaking from my personal experience, the whitelisting discussion here is intimately related to the package size discussion. They are the different facets of the same problem.

For GPU packages (or any packages that rely on vender-provided, proprietary prebuilt binaries) without whitelisting they only have the following options:

  1. dlopen
  2. static linking or auditwheel repair

Option 1 is technically still in violation of the auditwheel spirit, because it's still dynamic linking and could allow a non-self-contained wheel. It also requires a careful runtime check to ensure the correct shared library is dlopen'd.

Option 2, on the other hand, leads to significantly increased package sizes and, in particular for proprietary packages, potential violation of the vendor's EULA (some vendors put very strict redistribution limitation).

Finally, to adopt either option requires the downstream package providers to invest significant effort.

I am +1 for supporting whitelisting or any kind of escape hatch. The only (minor) disagreement I have with @mayeut's above comment is I see no cons, only pros 🙂

@rgommers
Copy link

This has now been proposed independently multiple times:

So that's a good fraction of the most popular scientific computing, data science and ML/AI libraries - plus some visualization mixed in for good measure.

There are already many packages on PyPI abusing PEP600, this is no reason to do so but, IMHO, having an escape hatch is always a good thing to have.

I would agree with the latter part. For the former I'd say: this is not abuse, there are good technical reasons for wanting to exclude a shared library. At least four separate ones have been given:

  • for libOpenGL it was related to drivers, the shared library must come from the system rather than from a wheel to work at all
  • for PyArrow and GPU packages it's package size. Unnecessary copying of .so's that are hunderds of MB (sometimes even ~1 GB as for libtorch) is very bad. Those .so's are already present in other wheels, and can be reused.
  • for OpenBLAS and OpenMP it's needing to have a single copy of the library, rather than two or more (and with multiple name-mangled copied you do end up with multiple copies) to avoid performance or functionality issues
  • vendors with a EULA that prohibits redistribution (not just CUDA, I think MKL is also in that category)

I am all for discouraging stuffing native dependencies in wheels when it is not needed, because it doesn't fit well with the design of PyPI and Python packaging. There's lots of ways to shoot ourselves in the foot, and the author-led model of uploading new releases of versions one by one, built without a common infrastructure, will make it hard to avoid problems or correct them later.

That said, it is very clear that there is a need. So there should be an escape hatch. The users of that escape hatch are responsible for ensuring that they do the preloading correctly in order to not end up with end users with broken systems. But in the end it's also not that hard to get it right, once the correct pattern is established (which is checking the location of the Python package providing the shared library first, then preloading it - and not simply using RPATH, as discussed in several of the other issues).

There is a risk that this will cause confusion and bad user experience, and it makes it more likely that users will run into ABI incompatibilities due to version mismatches between the extension and its excluded dependencies.

Agreed but my comment above applies: I think the pros outweigh the cons.

I agree with @mayeut here. This is a little nontrivial, but it's a dependency management problem. The limited set of projects that really need this do have maintainers that have some understanding of the issues. And I'm sure it'll go wrong once - but that's why there's a "yank" button on PyPI.

I wonder if this needs a larger discussion with the PyPA community, since the scope is bigger here than just auditwheel. If we want to properly support wheels which require some special external libraries, that could require changes in tooling in other areas as well.

Unless there are very concrete issues for the projects listed above that actually want to use this (did you have any in mind @lkollar?), it'd be great to just move this forward. In the end this is a ~10 line patch + tests, and if there really is a huge showstopper that turns up later - because there is none presented so far - then it can be reconsidered or reverted at that point. Requiring someone to start a large community-wide discussion, which may then veer into "please write a PEP" or "please consider all of the original pynativelib proposal" does not seem like a fair ask. Nor a necessary one. Not even desirable probably, since we don't want to encourage everyone to start doing this - only when one really needs it, and there's no other way out.

@mayeut
Copy link
Member

mayeut commented Oct 11, 2022

Thanks for your feedback @rgommers.

I edited my comment about a community-wide discussion. I think you're absolutely right on that one. I guess @mattip changed his view about not wanting this PR to be merged.
I will let @lkollar some time to answer your question though.

@mattip
Copy link

mattip commented Oct 11, 2022

Yes, I changed my view as I learned more about the pain points mentioned above: practicality sometimes beats purity.

Copy link
Member

@mayeut mayeut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll let another week to @lkollar to comment. I'll merge this next week otherwise.

@XuehaiPan
Copy link

XuehaiPan commented Oct 17, 2022

I think this PR also requires an option to update the rpath as:

did. I.e., add the relative path of the skipped libraries in rpath. Otherwise, the wheel may be broken for missing libraries.

My first thought is to change the value of --exclude colon separate value as:

auditwheel repair --exclude <soname1>:<rpath-entry1> <soname2>:<rpath-entry2> <soname3> xxx.whl

if the value ends with :<rpath-entry> then update the rpath. Otherwise, skip without any change.

For example, my own use case (exclude libtorch*.so from PyTorch):

auditwheel repair --exclude \
    'libtorch_python.so:$ORIGIN/../torch/lib' \
    ...
    # very long list of sonames
    ...
    package.whl

EDIT: Even further, I think it would be better to support glob patterns rather than specify the sonames one by one. Because the dependency libraries grow over time while new .so files can be added in new releases. The developer will need to change the auditwheel command options when its dependency releases a new version.

For example, torch provides different wheels for different user-side runtimes (e.g. cpu. cu102, cu113). And the corresponding wheels ship different .so files (e.g. cpu build does not have CUDA-related libraries). The package maintainer will need to download all version and all builds of the dependency wheels and takes a union set of the present external .so files then add them to the auditwheel repair --exclude command line.

$ /usr/bin/ldd _C.cpython-38-x86_64-linux-gnu.so
        linux-vdso.so.1 (0x00007fff6d2f3000)
        libtorch_python.so => /home/PanXuehai/Miniconda3/envs/torchopt/lib/python3.8/site-packages/torch/lib/libtorch_python.so (0x00007f59db972000)
        libgomp.so.1 => /home/PanXuehai/Miniconda3/envs/torchopt/lib/libgomp.so.1 (0x00007f59db939000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f59db922000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f59db8ff000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f59db8f9000)
        libstdc++.so.6 => /home/PanXuehai/Miniconda3/envs/torchopt/lib/libstdc++.so.6 (0x00007f59db743000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f59db5f4000)
        libgcc_s.so.1 => /home/PanXuehai/Miniconda3/envs/torchopt/lib/libgcc_s.so.1 (0x00007f59db5db000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f59db3e9000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f59dc82e000)
        libshm.so => /home/PanXuehai/Miniconda3/envs/torchopt/lib/python3.8/site-packages/torch/lib/libshm.so (0x00007f59db3df000)
        libtorch.so => /home/PanXuehai/Miniconda3/envs/torchopt/lib/python3.8/site-packages/torch/lib/libtorch.so (0x00007f59db3d8000)
        libtorch_cuda.so => /home/PanXuehai/Miniconda3/envs/torchopt/lib/python3.8/site-packages/torch/lib/libtorch_cuda.so (0x00007f59db3b9000)
        libtorch_cuda_cpp.so => /home/PanXuehai/Miniconda3/envs/torchopt/lib/python3.8/site-packages/torch/lib/libtorch_cuda_cpp.so (0x00007f59d5786000)
        libnvToolsExt.so.1 => /home/PanXuehai/Miniconda3/envs/torchopt/lib/python3.8/site-packages/torch/lib/../../../../libnvToolsExt.so.1 (0x00007f59d557b000)
        libtorch_cpu.so => /home/PanXuehai/Miniconda3/envs/torchopt/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so (0x00007f59c8e47000)
        libc10_cuda.so => /home/PanXuehai/Miniconda3/envs/torchopt/lib/python3.8/site-packages/torch/lib/libc10_cuda.so (0x00007f59c8df1000)
        libcudart.so.11.0 => /home/PanXuehai/Miniconda3/envs/torchopt/lib/python3.8/site-packages/torch/lib/../../../../libcudart.so.11.0 (0x00007f59c8b49000)
        libcudnn.so.8 => /home/PanXuehai/Miniconda3/envs/torchopt/lib/python3.8/site-packages/torch/lib/libcudnn.so.8 (0x00007f59c891e000)
        libtorch_cuda_cu.so => /home/PanXuehai/Miniconda3/envs/torchopt/lib/python3.8/site-packages/torch/lib/libtorch_cuda_cu.so (0x00007f599b37d000)
        libc10.so => /home/PanXuehai/Miniconda3/envs/torchopt/lib/python3.8/site-packages/torch/lib/libc10.so (0x00007f599b2e0000)
        libcusparse.so.11 => /home/PanXuehai/Miniconda3/envs/torchopt/lib/python3.8/site-packages/torch/lib/../../../../libcusparse.so.11 (0x00007f598d14b000)
        libcurand.so.10 => /home/PanXuehai/Miniconda3/envs/torchopt/lib/python3.8/site-packages/torch/lib/../../../../libcurand.so.10 (0x00007f598755a000)
        libcufft.so.10 => /home/PanXuehai/Miniconda3/envs/torchopt/lib/python3.8/site-packages/torch/lib/../../../../libcufft.so.10 (0x00007f597eacb000)
        libmkl_intel_lp64.so => /home/PanXuehai/Miniconda3/envs/torchopt/lib/python3.8/site-packages/torch/lib/../../../../libmkl_intel_lp64.so (0x00007f597df2c000)
        libmkl_gnu_thread.so => /home/PanXuehai/Miniconda3/envs/torchopt/lib/python3.8/site-packages/torch/lib/../../../../libmkl_gnu_thread.so (0x00007f597c3a1000)
        libmkl_core.so => /home/PanXuehai/Miniconda3/envs/torchopt/lib/python3.8/site-packages/torch/lib/../../../../libmkl_core.so (0x00007f5977f31000)
        libcupti-ea0c9f68.so.11.6 => /home/PanXuehai/Miniconda3/envs/torchopt/lib/python3.8/site-packages/torch/lib/libcupti-ea0c9f68.so.11.6 (0x00007f5977682000)
        libcublas.so.11 => /home/PanXuehai/Miniconda3/envs/torchopt/lib/python3.8/site-packages/torch/lib/../../../../libcublas.so.11 (0x00007f596dc0a000)
        libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f596dc05000)
        libcublasLt.so.11 => /home/PanXuehai/Miniconda3/envs/torchopt/lib/python3.8/site-packages/torch/lib/../../../../libcublasLt.so.11 (0x00007f5958676000)

I think it would be nice to have:

auditwheel repair --exclude 'torch/lib/*.so:$ORIGIN/../torch/lib' package.whl

@rgommers
Copy link

I.e., add the relative path of the skipped libraries in rpath. Otherwise, the wheel may be broken for missing libraries.

No, please do not do that, it is not going to work. See #391 (comment) for why, and further down in that discussion for a more robust solution: #391 (comment).

@leofang
Copy link

leofang commented Oct 17, 2022

The rpath hack does not work for system dependencies (OpenGL, CUDA, Vulkan, ...), which is a wide realm that this PR aims to cover, so please do not do that.

@XuehaiPan
Copy link

I.e., add the relative path of the skipped libraries in rpath. Otherwise, the wheel may be broken for missing libraries.

No, please do not do that, it is not going to work. See #391 (comment) for why, and further down in that discussion for a more robust solution: #391 (comment).

The rpath hack does not work for system dependencies (OpenGL, CUDA, Vulkan, ...), which is a wide realm that this PR aims to cover, so please do not do that.

@rgommers @leofang Thanks for the clarification. It's reasonable to skip only for the system dependencies. Waiting for a more robust approach for excluding third-party libraries. Adding a wrapping layer looks better than rpath hacking. I will try my best to investigate that.

@mayeut mayeut merged commit 47c9399 into pypa:main Oct 22, 2022
@mayeut
Copy link
Member

mayeut commented Oct 22, 2022

Let's get this merged.
Thanks @martinRenou for the PR.
Thanks to all for your inputs.

@martinRenou martinRenou deleted the exclude_so_files branch October 24, 2022 07:08
@lkollar
Copy link
Contributor

lkollar commented Oct 24, 2022

Sorry for not answering earlier @mayeut. I was going to agree with merging this though. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet