Discussion: support bundling libmagic #233
Comments
Thanks for starting this discussion. I think a nice way to handle this is to publish a separate package that exposes the shared lib/data file with package_data, and then make that an optional dependency of python-magic with extras_require. Using extras_require is nice because we can bump the versions together, though it is not strictly necessary. What do you think? |
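A minimal sketch of how that split could look, assuming a companion package that carries the shared library and magic database as package_data, and an extra on python-magic that pins it. The names `libmagic-bundle`, `libmagic_bundle` and the extra `bundled` are hypothetical placeholders, not names agreed on in this thread, and the file patterns are illustrative.

```python
# Sketch only: "libmagic-bundle", "libmagic_bundle" and the "bundled" extra
# are hypothetical names used for illustration.


def companion_setup_kwargs(version):
    """Keyword arguments for the companion package that carries the binaries."""
    return {
        "name": "libmagic-bundle",
        "version": version,
        "packages": ["libmagic_bundle"],
        # Ship the shared library and the magic database next to the Python code.
        "package_data": {"libmagic_bundle": ["libmagic.*", "magic.mgc"]},
    }


def python_magic_setup_kwargs(version, bundle_version):
    """Keyword arguments for python-magic with the companion as an optional extra."""
    return {
        "name": "python-magic",
        "version": version,
        # Pinning the companion version lets both packages be bumped together.
        "extras_require": {"bundled": ["libmagic-bundle==" + bundle_version]},
    }
```

Each dict would be passed to `setuptools.setup(**kwargs)` in its own setup.py; users who want the bundled binary would then install `python-magic[bundled]`.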
This is more or less what we do now:
And I have a build loop otherwise in https://github.com/nexB/scancode-plugins/blob/develop/etc/scripts/fetch-plugins.sh to do the actual footwork of assembling pre-built binaries for all OSes. So in recap, we can adapt, steal, reuse or not any of the code above for make benefit of the great libmagic! |
In the meantime as this discussion didn't seem to go anywhere -- I went and created a https://pypi.org/project/pylibmagic/ You just need to install and import this before importing |
Merging into #293 |
Is this merging appropriate? The merged issue is specifically about Windows, but this issue is not OS-specific. |
@kratsg Basically 100% of the issues with libmagic are on Windows, so my intent was to just solve it there. OSX and Linux both have good solutions for this. Of course, in principle, once this is set up for Windows, other platforms are straightforward, but given that Python doesn't have awesome tooling for building and shipping binaries, I'd rather keep it limited. |
@ahupp that's fair. I've solved it for MacOSX and Linux via https://github.com/kratsg/pylibmagic/ right now. The solution there is that it ships a pre-built binary of import pylibmagic
import magic will. It does require some monkeypatching of utilities that |
the whole idea of python-magic uploading binary (wheel) distribution would be to package the libmagic binary into the wheel (zip). from the packaging point of view, there should:
|
It's certainly possible to ship the binaries for every platform, but it's
not obviously a good cost/benefit tradeoff. It does have the nice benefit
of avoiding version skew between the Python and native parts. Do you feel
like using outside packages from Debian, homebrew etc is a problem?
…On Mon, Aug 28, 2023, 12:14 PM ddelange ***@***.***> wrote:
the whole idea of python-magic uploading binary (wheel) distribution would
be to package the libmagic binary into the wheel (zip).
from the packaging point of view, there should:
- either only host a source distribution (which will fail to install if
libmagic is not available on the system)
- upload linux/win/mac platform dependent wheels that include
libmagic, e.g. using cibuildwheel
<https://github.com/pypa/cibuildwheel> in github actions with a
platform-aware CIBW_BEFORE_ALL=./install_libmagic.sh
<https://cibuildwheel.readthedocs.io/en/stable/options/#before-all>
full example
<https://github.com/MagicStack/asyncpg/blob/v0.28.0/.github/workflows/release.yml#L73-L130>
|
I only know of one other package that serves binary distributions (.whl), and still requires the user to additionally install an external binary (which will be dynamically linked / searched for at runtime): https://pypi.org/project/mxnet/ But that's only because of licensing of that one binary, which they would otherwise include in the binary distribution. Wheels are officially only allowed to dynamically link against glibc on the system, anything else needs to be included in the wheel. Generally, you would:
So to answer your question:
|
@ddelange I did a random sample of some top-250 packages that are (afaik) source-only and they all distribute a .-py3-none-any.whl: https://pypi.org/project/typing-extensions/#files I thought wheel files were used because they are the product of any "build" step (setup.py etc) so don't need to execute any code to install? |
Any project that needs compiled binaries (cythonized, rust binaries, c++ backend etc), will publish wheels for a wealth of combinations of python version (minor version specific ABI), OS and CPU architecture, containing pre-compiled binaries that will execute on the target system. See for instance this list of popular python libraries.
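To make the distinction concrete, here is a small sketch that reads the python/ABI/platform tags out of a wheel filename and reports whether it is a pure-Python wheel (`none-any`) or a platform-specific one. The helper names are hypothetical and the filenames in the usage note are illustrative; real tooling should rely on a packaging library rather than string splitting.

```python
def wheel_tags(filename):
    """Return (python_tag, abi_tag, platform_tag) from a wheel filename.

    Wheel names look like name-version[-build]-python-abi-platform.whl,
    so the last three dash-separated fields are the compatibility tags.
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename layout: " + filename)
    python_tag, abi_tag, platform_tag = parts[-3:]
    return python_tag, abi_tag, platform_tag


def is_pure_python(filename):
    """True when the wheel claims no ABI and no platform dependency."""
    _, abi_tag, platform_tag = wheel_tags(filename)
    return abi_tag == "none" and platform_tag == "any"
```

For example, `is_pure_python("typing_extensions-4.7.1-py3-none-any.whl")` is true, while a name such as `asyncpg-0.28.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl` is tied to one interpreter ABI and one platform.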
That is correct: when wheels are available on PyPI, pip does not need to execute setup.py. So in the case of libmagic, strictly speaking you should host an sdist on PyPI, which will detect a missing libmagic at install time by an assertion in setup.py (or by a copy attempt). If you choose to additionally host bdists (wheels) on PyPI, they should be self-contained, system-specific wheels containing precompiled libmagic binaries. Does that make sense? |
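A rough sketch of the install-time check an sdist could run from setup.py, assuming `ctypes.util.find_library` is an acceptable way to look for libmagic. The function name `require_libmagic` is hypothetical, and `find_library` searches differently on each platform, so a real check may need extra search paths.

```python
import ctypes.util


def require_libmagic(find_library=ctypes.util.find_library):
    """Fail early with a readable message when libmagic cannot be located.

    find_library is injectable so the check can be exercised without
    touching the real system.
    """
    found = find_library("magic")
    if found is None:
        raise RuntimeError(
            "libmagic was not found on this system; install it with your "
            "system package manager before installing this package"
        )
    return found
```

Calling `require_libmagic()` near the top of setup.py would make a source install stop before anything is copied into site-packages.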
That makes sense, thanks for the explanation.
But, surely this isn't the only package that has an external dependency on
some installed library though, there are plenty of cases where you
prefer/must rely on something outside.
Regardless, it is clearly a source of regular issues for users and seems
worth fixing. I'll take a look at your PR soon and go from there.
…On Fri, Sep 1, 2023, 7:26 PM ddelange ***@***.***> wrote:
py3-none-any.whl wheels (a wheel is just a zip file with a .whl
extension) can run on any python 3.5+ distribution, on win, mac, and nix,
regardless of cpu architecture (aarch, x86_64, etc), because they only
contain python files and no compiled binaries. These are pure-python
libraries. If the code will run on both py2.7 and py3.5+, you can run python
setup.py bdist_wheel --universal and you'll get a py2.py3-none-any.whl.
Any project that needs compiled binaries (cythonized, rust binaries, c++
backend etc), will publish wheels for a wealth of combinations of python
version (minor version specific ABI), OS and CPU architecture, containing
pre-compiled binaries that will execute on the target system. See for
instance this list
<catboost/catboost#2481 (comment)>
of popular python libraries.
I thought wheel files were used because they are the product of any
"build" step (setup.py etc) so don't need to execute any code to install?
That is correct, when wheels are available on PyPI, pip does not need to
execute setup.py, but can copy the python (and binary) files from the
wheel straight into site-packages. But as explained above, wheels hosted on
PyPI should be self-contained.
So in case of libmagic, it's either hosting only sdist on PyPI (so that a
missing libmagic will be detected on install time by assertion in
setup.py), or additionally hosting self-contained, system specific wheels
on PyPI containing precompiled libmagic binaries.
Does that make sense?
|
I forked this fine code for a long while at https://github.com/nexB/typecode/blob/8e926684f260ce1cf7ffed74b2da99db97210f13/src/typecode/magic2.py
One of the key changes is that I can provide a bundled pre-built binary or use a system-provided binary for libmagic and the magic db, which is not possible here.
For instance:
https://github.com/nexB/scancode-plugins/tree/develop/builtins/typecode_libmagic-linux and https://github.com/nexB/scancode-plugins/tree/develop/builtins/typecode_libmagic_system_provided
I would much prefer to fold that code back here at some point.
Would you be open to having a way to provide a libmagic and db path rather than always using the same heuristics code?
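One possible shape for such an option, as a sketch only: explicit arguments win, then environment variables, then the existing system lookup. The function name and the environment variable names `LIBMAGIC_PATH` and `LIBMAGIC_DB` are hypothetical and are not part of python-magic today.

```python
import ctypes.util
import os


def resolve_libmagic(lib_path=None, db_path=None, environ=None,
                     find_library=ctypes.util.find_library):
    """Pick the libmagic library and magic database locations.

    Precedence: explicit argument, then environment variable (hypothetical
    names), then the system lookup for the library. A db of None means
    "let libmagic use its compiled-in default database".
    """
    if environ is None:
        environ = os.environ
    lib = lib_path or environ.get("LIBMAGIC_PATH") or find_library("magic")
    db = db_path or environ.get("LIBMAGIC_DB")
    return lib, db
```

A plugin that bundles a pre-built binary could pass both paths explicitly, while a system-provided setup would pass nothing and fall through to the heuristics.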