GH-25118: [Python] Make NumPy an optional runtime dependency #41904

Merged on Sep 2, 2024 (66 commits)

@raulcd raulcd commented May 31, 2024

Rationale for this change

Make it possible to run pyarrow without requiring NumPy.

What changes are included in this PR?

If NumPy is not installed, pyarrow can still be imported and the functionality that does not need NumPy still runs.
A new CI job has been added to run some basic tests without NumPy.
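
One common way to make a dependency optional is to guard its import and raise only in the code paths that actually need it. The sketch below illustrates that general pattern in plain Python. It is not taken from this PR's diff: `optional_import`, `to_numpy`, and `total` are hypothetical names used only for illustration.

```python
import importlib


def optional_import(name):
    """Return the named module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Hypothetical module-level handle; it is None when NumPy is absent.
np = optional_import("numpy")


def to_numpy(values):
    """Convert a sequence to a NumPy array, failing clearly without NumPy."""
    if np is None:
        raise ImportError("NumPy is required for to_numpy()")
    return np.asarray(values)


def total(values):
    """A code path that never touches NumPy, so it works either way."""
    return sum(values)
```

With this shape, importing the module never fails because of a missing NumPy, and only callers of `to_numpy` see an `ImportError`.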

Are these changes tested?

Yes, via CI.

Are there any user-facing changes?

Yes. NumPy can be removed from the user's installation and pyarrow functionality that does not depend on it still works.

@raulcd raulcd force-pushed the GH-25118 branch 2 times, most recently from cad9bef to bccf733 Compare June 6, 2024 09:47

@jorisvandenbossche jorisvandenbossche left a comment

I am a bit concerned about the maintenance effort for the tests and having to remember to add @pytest.mark.numpy everywhere needed.
I am wondering: could we have a nonumpy marker instead, and label a subset of tests we explicitly want to run in the CI build without numpy? (Of course, if that becomes a lot of tests, it will also be unwieldy.) But then at least by default we don't have to think about marking tests with numpy, and it's only when specifically working on tests for numpy being optional that we have to add a marker.

Another idea would be to split certain test files in two, so that a single marker at the top of each file indicates whether it uses numpy, instead of having to mark each individual test.

Just brainstorming a bit!
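
For readers unfamiliar with how such a marker is typically wired up, here is a minimal `conftest.py` sketch that skips tests marked with `@pytest.mark.numpy` when NumPy cannot be found. It uses standard pytest hooks but is only an illustration under stated assumptions, not the configuration this PR ended up with; `HAVE_NUMPY` and `needs_skip` are hypothetical names.

```python
# conftest.py (illustrative sketch, not pyarrow's actual configuration)
import importlib.util

import pytest

# True when a NumPy installation can be located, without importing it.
HAVE_NUMPY = importlib.util.find_spec("numpy") is not None


def needs_skip(marker_names, have_numpy):
    """Hypothetical helper: skip only numpy-marked tests when NumPy is absent."""
    return "numpy" in marker_names and not have_numpy


def pytest_configure(config):
    # Register the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line("markers", "numpy: test requires NumPy")


def pytest_collection_modifyitems(config, items):
    skip = pytest.mark.skip(reason="NumPy is not installed")
    for item in items:
        names = {mark.name for mark in item.iter_markers()}
        if needs_skip(names, HAVE_NUMPY):
            item.add_marker(skip)
```

The per-file variant mentioned above would correspond to a module-level `pytestmark = pytest.mark.numpy` assignment, which pytest applies to every test in that module.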

pitrou commented Sep 2, 2024

@github-actions crossbow submit -g wheel -g python

@pitrou pitrou left a comment

LGTM, let's just wait for CI. Great work Raul!

github-actions bot commented Sep 2, 2024

Revision: 08da867

Submitted crossbow builds: ursacomputing/crossbow @ actions-e575e4ce7c

Task Status
example-python-minimal-build-fedora-conda GitHub Actions
example-python-minimal-build-ubuntu-venv GitHub Actions
test-conda-python-3.10 GitHub Actions
test-conda-python-3.10-cython2 GitHub Actions
test-conda-python-3.10-hdfs-2.9.2 GitHub Actions
test-conda-python-3.10-hdfs-3.2.1 GitHub Actions
test-conda-python-3.10-pandas-latest-numpy-1.26 GitHub Actions
test-conda-python-3.10-pandas-latest-numpy-latest GitHub Actions
test-conda-python-3.10-pandas-nightly-numpy-nightly GitHub Actions
test-conda-python-3.10-substrait GitHub Actions
test-conda-python-3.11 GitHub Actions
test-conda-python-3.11-dask-latest GitHub Actions
test-conda-python-3.11-dask-upstream_devel GitHub Actions
test-conda-python-3.11-hypothesis GitHub Actions
test-conda-python-3.11-pandas-upstream_devel-numpy-nightly GitHub Actions
test-conda-python-3.11-spark-master GitHub Actions
test-conda-python-3.12 GitHub Actions
test-conda-python-3.12-cpython-debug GitHub Actions
test-conda-python-3.8 GitHub Actions
test-conda-python-3.8-pandas-1.0-numpy-1.19 GitHub Actions
test-conda-python-3.9 GitHub Actions
test-conda-python-3.9-pandas-latest-numpy-latest GitHub Actions
test-conda-python-emscripten GitHub Actions
test-cuda-python GitHub Actions
test-debian-12-python-3-amd64 GitHub Actions
test-debian-12-python-3-i386 GitHub Actions
test-fedora-39-python-3 GitHub Actions
test-ubuntu-20.04-python-3 GitHub Actions
test-ubuntu-22.04-python-3 GitHub Actions
wheel-macos-monterey-cp310-amd64 GitHub Actions
wheel-macos-monterey-cp310-arm64 GitHub Actions
wheel-macos-monterey-cp311-amd64 GitHub Actions
wheel-macos-monterey-cp311-arm64 GitHub Actions
wheel-macos-monterey-cp312-amd64 GitHub Actions
wheel-macos-monterey-cp312-arm64 GitHub Actions
wheel-macos-monterey-cp313-amd64 GitHub Actions
wheel-macos-monterey-cp313-arm64 GitHub Actions
wheel-macos-monterey-cp38-amd64 GitHub Actions
wheel-macos-monterey-cp38-arm64 GitHub Actions
wheel-macos-monterey-cp39-amd64 GitHub Actions
wheel-macos-monterey-cp39-arm64 GitHub Actions
wheel-manylinux-2-28-cp310-amd64 GitHub Actions
wheel-manylinux-2-28-cp310-arm64 GitHub Actions
wheel-manylinux-2-28-cp311-amd64 GitHub Actions
wheel-manylinux-2-28-cp311-arm64 GitHub Actions
wheel-manylinux-2-28-cp312-amd64 GitHub Actions
wheel-manylinux-2-28-cp312-arm64 GitHub Actions
wheel-manylinux-2-28-cp313-amd64 GitHub Actions
wheel-manylinux-2-28-cp313-arm64 GitHub Actions
wheel-manylinux-2-28-cp38-amd64 GitHub Actions
wheel-manylinux-2-28-cp38-arm64 GitHub Actions
wheel-manylinux-2-28-cp39-amd64 GitHub Actions
wheel-manylinux-2-28-cp39-arm64 GitHub Actions
wheel-manylinux-2014-cp310-amd64 GitHub Actions
wheel-manylinux-2014-cp310-arm64 GitHub Actions
wheel-manylinux-2014-cp311-amd64 GitHub Actions
wheel-manylinux-2014-cp311-arm64 GitHub Actions
wheel-manylinux-2014-cp312-amd64 GitHub Actions
wheel-manylinux-2014-cp312-arm64 GitHub Actions
wheel-manylinux-2014-cp313-amd64 GitHub Actions
wheel-manylinux-2014-cp313-arm64 GitHub Actions
wheel-manylinux-2014-cp38-amd64 GitHub Actions
wheel-manylinux-2014-cp38-arm64 GitHub Actions
wheel-manylinux-2014-cp39-amd64 GitHub Actions
wheel-manylinux-2014-cp39-arm64 GitHub Actions
wheel-windows-cp310-amd64 GitHub Actions
wheel-windows-cp311-amd64 GitHub Actions
wheel-windows-cp312-amd64 GitHub Actions
wheel-windows-cp313-amd64 GitHub Actions
wheel-windows-cp38-amd64 GitHub Actions
wheel-windows-cp39-amd64 GitHub Actions

@pitrou pitrou merged commit 9ab9532 into apache:main Sep 2, 2024
59 of 63 checks passed
@pitrou pitrou removed the awaiting merge Awaiting merge label Sep 2, 2024
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 9ab9532.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 1 possible false positive for unstable benchmarks that are known to sometimes produce them.

@jorisvandenbossche

Thanks @raulcd!

mapleFU pushed a commit to mapleFU/arrow that referenced this pull request Sep 3, 2024
…pache#41904)

* GitHub Issue: apache#25118

Lead-authored-by: Raúl Cumplido
Co-authored-by: Joris Van den Bossche
Co-authored-by: Antoine Pitrou
Signed-off-by: Antoine Pitrou
zanmato1984 pushed a commit to zanmato1984/arrow that referenced this pull request Sep 6, 2024
khwilson pushed a commit to khwilson/arrow that referenced this pull request Sep 14, 2024