Convert python API into a dynamic-link library for Linux #679

Open
7012xxx opened this issue Sep 10, 2024 · 14 comments

Comments

@7012xxx

7012xxx commented Sep 10, 2024

How can I build this Python API into a dynamic-link library for Linux, generating a file with the suffix '.so'?

@7012xxx 7012xxx changed the title Python API into a dynamic-link library for Linux Convert python API into a dynamic-link library for Linux Sep 10, 2024
@reyammer
Collaborator

I believe the rust library would be useful for this use case? /cc @ia0?

@ia0
Member

ia0 commented Sep 12, 2024

I'm not sure I fully understand the initial question. There are at least two ways to interpret it for someone like me who is unfamiliar with Python:

  • Provide a dynamic library of the Python API to be used in Python. (That's the part I don't know is possible.)
  • Provide a dynamic library of a C API similar to the Python API to be used by any language that dynamically links to C. (I'm also not sure if it's possible to do this from the Python library. It can probably be done from the Rust library although that depends on ort support. The API might also not really match the Python one.)

So I guess the best would be to know what the problem is rather than what a possible solution could be (XY problem).

@7012xxx could you help us understand what you are trying to do for which you believe a dynamic library could help? Thanks!

@7012xxx
Author

7012xxx commented Sep 13, 2024

Thank you for your reply. I am looking to use Magika in an environment that does not have Python >=3.8. Additionally, I want to call Magika using both Golang and Python. During my research, I discovered that dynamic link libraries might meet my needs, enabling seamless calls between Golang and Python.

@ia0
Member

ia0 commented Sep 13, 2024

Thanks, so it looks like for the Golang use-case, we have something planned: Providing a C API to the Rust library (ideally as both a static and dynamic library). But this still needs to be done. For the Python use-case, either the same library could be used (although since Python is not a compiled language, the static library probably won't be an option), or we could provide Python bindings to the Rust library using PyO3 and Maturin. This too would need to be designed and implemented.

You can follow #96 for the Golang use-case. For Python, we'll have to decide if we go with C (thus #90) or with PyO3 (which would require a new issue).

@reyammer
Collaborator

My take for the python part (feedback is welcome):

  • python-wise, magika already supports >=3.8, which are all currently supported python versions out there (python <=3.7 is out of support). I think it's OK we don't have "easy support" for such older versions.
  • once one has a shared object .so with the main functionality (which we should be able to generate from the rust codebase, right @ia0?), it should be very easy to create a simple python wrapper around that (e.g., with ctypes).
  • we could provide such "python bindings around a shared object" (in addition to the Magika python module we already have), but we have many higher-priority aspects to take care of, and I'm not sure how realistic it is that we'll find time any time soon.

@7012xxx: would the ctypes route around an .so file work for you?
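
The ctypes route mentioned above can be sketched as follows. This is only an illustration of the wrapping pattern, not Magika's API: no Magika C API exists yet in this thread, so the system C library and its `strlen` function stand in for the future shared object and its exported functions. The helper name `c_strlen` is hypothetical.

```python
import ctypes
import ctypes.util

# Stand-in for a future Magika shared object: the system C library.
# ASSUMPTION: a real wrapper would load the Magika .so by path instead.
_lib = ctypes.CDLL(ctypes.util.find_library("c") or "libc.so.6")

# Declare the C signature explicitly so ctypes converts arguments and
# the return value correctly: size_t strlen(const char *s);
_lib.strlen.argtypes = [ctypes.c_char_p]
_lib.strlen.restype = ctypes.c_size_t


def c_strlen(data):
    """Return the length of a NUL-terminated byte string, computed in C."""
    return _lib.strlen(data)


if __name__ == "__main__":
    print(c_strlen(b"magika"))
```

A wrapper around a real library would follow the same three steps: load the `.so` with `ctypes.CDLL`, declare `argtypes` and `restype` for each exported function, and expose small Python functions on top.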

@ia0
Member

ia0 commented Sep 13, 2024

  • once one has a shared object .so with the main functionality (which we should be able to generate from the rust codebase, right @ia0?)

For Rust in general yes, but for this particular case depending on ONNX through the ort crate, I don't know. We would need to test it, but it's indeed part of the plan to try. I expect a static library to be simpler. I'm expecting to track this in #90.

@7012xxx
Author

7012xxx commented Sep 22, 2024

I am eager to obtain the cdylib (i.e., the .so file) I need by using Rust. However, I lack background knowledge in Rust, and I ran into a problem during compilation: I can't directly generate the .so file under the /rust directory via "cargo build --release". It tells me that the toml file needs to be edited. After several attempts at editing it, I still can't compile successfully. Could you please provide an example toml file or help me compile magika into a .so file?

@7012xxx
Author

7012xxx commented Sep 22, 2024

I want to use Magika without relying on any other libraries and language environments.

@ia0
Member

ia0 commented Sep 23, 2024

Here's an example of how to build a dynamic library in Rust and use it from C. Ultimately we'll build such a magika-api crate and provide a .h and .so file. But this is not yet on the table. This small example should provide you all the knowledge you need to build a C API that suits your needs.

@ia0
Member

ia0 commented Oct 8, 2024

Hello, ia0.
I have successfully compiled magika into a .so file according to the steps you provided. However, during the process of using the C API, I encountered the following error:

     python pyMagikaDemo.py libmagika_api.so
    Traceback (most recent call last):
      File "pyMagikaDemo.py", line 6, in <module>
        libmagika = ctypes.CDLL('./libmagika_api.so')
      File "/usr/lib64/python2.7/ctypes/__init__.py", line 360, in __init__
        self._handle = _dlopen(self._name, mode)
    OSError: ./libmagika_api.so: undefined symbol: _ZNKSt19__codecvt_utf8_baseIwE6do_outER11__mbstate_tPKwS4_RS4_PcS6_RS6_

After my inspection, I found that this symbol is missing in both ./libmagika_api.so and libstdc++.so.6. Is this problem caused by not importing all the correct dependencies?

I'm following up here instead of by email because others might be able to help you much better than me. _ZNKSt19__codecvt_utf8_baseIwE6do_outER11__mbstate_tPKwS4_RS4_PcS6_RS6_ should be in libstdc++.so.6. Does your magika depend on libstdc++.so.6? Mine does:

% readelf -d ../target/release/libmagika_api.so
[...]
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
[...]

In particular, did you manage to successfully run make ARG=src/lib.rs on your machine in my branch?
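
A quick way to test whether a given symbol can be resolved through a shared library is to ask the dynamic loader via ctypes. The sketch below is a generic check, with the hypothetical helper name `exports_symbol`; it is demonstrated on the C library because that is available on any Linux system. Note that the lookup also searches the library's own dependencies, so a positive result means "resolvable through this library", not necessarily "defined in this file".

```python
import ctypes
import ctypes.util


def exports_symbol(library_path, symbol):
    """Return True if the loader can resolve `symbol` via `library_path`."""
    lib = ctypes.CDLL(library_path)
    try:
        # Attribute access triggers a dlsym() lookup on the loaded library.
        getattr(lib, symbol)
    except AttributeError:
        return False
    return True


if __name__ == "__main__":
    # Demonstration on the C library; for the case discussed here one would
    # pass "/lib64/libstdc++.so.6" and the mangled name from the error.
    libc_path = ctypes.util.find_library("c") or "libc.so.6"
    print(exports_symbol(libc_path, "strlen"))
    print(exports_symbol(libc_path, "no_such_symbol_for_demo"))
```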

@7012xxx
Author

7012xxx commented Oct 10, 2024

[...]
    Finished `release` profile [optimized] target(s) in 1m 29s
gcc example.c -o example -lmagika_api -L../target/release
../target/release/libmagika_api.so: undefined reference to `std::out_of_range::out_of_range(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
../target/release/libmagika_api.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_replace(unsigned long, unsigned long, char const*, unsigned long)'

I can successfully complete the compilation of the .so file. However, I cannot complete the compilation using gcc. It appears that libmagika_api.so is not linked successfully during the gcc compilation process. I believe this does not impact my usage of the .so file, so I haven't paid much attention to this issue.

@ia0
Member

ia0 commented Oct 10, 2024

It looks to me like a problem with your platform. Maybe it's too old or it's missing some library. If you can't resolve it on your own, your best bet is to wait until we support this use-case. This is currently not supported.

@7012xxx
Author

7012xxx commented Oct 10, 2024

The following are my Linux platform parameters.

NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

The following is the dependency relationship of the libmagika_api.so that I compiled. As can be seen, libstdc++.so.6 has been installed.

	linux-vdso.so.1 =>  (0x00007ffec5b85000)
	libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f2a70ec5000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f2a70caf000)
	librt.so.1 => /lib64/librt.so.1 (0x00007f2a70aa7000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f2a7088b000)
	libm.so.6 => /lib64/libm.so.6 (0x00007f2a70589000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f2a70385000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f2a6ffb7000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f2a72a0c000)

Everything appears to be in order, so I cannot determine whether any dependent libraries are missing. Do you have any suggestions?

[root@sz-platform-operation-1 lib64]# strings /lib64/libstdc++.so.6 | grep LIBCXX
GLIBCXX_3.4
GLIBCXX_3.4.1
GLIBCXX_3.4.2
GLIBCXX_3.4.3
GLIBCXX_3.4.4
GLIBCXX_3.4.5
GLIBCXX_3.4.6
GLIBCXX_3.4.7
GLIBCXX_3.4.8
GLIBCXX_3.4.9
GLIBCXX_3.4.10
GLIBCXX_3.4.11
GLIBCXX_3.4.12
GLIBCXX_3.4.13
GLIBCXX_3.4.14
GLIBCXX_3.4.15
GLIBCXX_3.4.16
GLIBCXX_3.4.17
GLIBCXX_3.4.18
GLIBCXX_3.4.19
GLIBCXX_DEBUG_MESSAGE_LENGTH
[root@sz-platform-operation-1 lib64]# g++ --version
g++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39)
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Is there a compatibility issue between the C++ standard library and GCC here?
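
One way to read the listing above, offered as a sketch rather than a confirmed diagnosis: the newest tag it contains is GLIBCXX_3.4.19, which matches the libstdc++ shipped with GCC 4.8. The `std::__cxx11::basic_string` symbols in the link errors belong to the newer libstdc++ ABI that arrived with GCC 5 (GLIBCXX_3.4.21). The snippet below parses `strings` output and compares the newest tag against a required version. The threshold of 3.4.21 is an assumption based on those symbol names; the library actually being linked may need something newer. The function names are hypothetical.

```python
import re


def glibcxx_versions(strings_output):
    """Return sorted version tuples for each GLIBCXX_x.y[.z] tag in the text."""
    tags = re.findall(r"^GLIBCXX_(\d+(?:\.\d+)*)$", strings_output, flags=re.MULTILINE)
    return sorted(tuple(int(part) for part in tag.split(".")) for tag in tags)


def supports(strings_output, required):
    """Return True if the newest GLIBCXX tag is at least `required`."""
    versions = glibcxx_versions(strings_output)
    return bool(versions) and versions[-1] >= required


# Sample mirroring the listing above: GLIBCXX_3.4 up to GLIBCXX_3.4.19.
sample = "\n".join(
    ["GLIBCXX_3.4"]
    + ["GLIBCXX_3.4.%d" % i for i in range(1, 20)]
    + ["GLIBCXX_DEBUG_MESSAGE_LENGTH"]
)

if __name__ == "__main__":
    print(glibcxx_versions(sample)[-1])
    # ASSUMPTION: (3, 4, 21) is the minimum implied by the __cxx11 symbols.
    print(supports(sample, (3, 4, 21)))
```

If the check fails, as it would for this listing, the library was likely built against a newer C++ runtime than the one installed on the machine.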

@7012xxx
Author

7012xxx commented Oct 14, 2024

@ia0 Could you provide the platform parameters when you compile with cargo? For example, the C++ version and Linux version. Also, which libraries does the compiled libmagika_api.so depend on?
