[Datasets] Add support for using Arrow 10 with Ray (Core/Datasets/AIR). #29997

Closed
clarkzinzow opened this issue Nov 3, 2022 · 2 comments · Fixed by #29999
Labels: data (Ray Data-related issues), enhancement (Request for new feature and/or capability), P1 (Issue that should be fixed within a few weeks)

Comments

@clarkzinzow
Contributor

clarkzinzow commented Nov 3, 2022

Arrow 10 has some breaking API changes and bugs that we need to accommodate in order to make it usable with Ray.

Known Issues

Success Criteria

  • CI job running against Arrow 10 is passing
@clarkzinzow added the enhancement, P1, air, and data labels on Nov 3, 2022
@clarkzinzow added this to the Arrow 7+ Support milestone on Nov 3, 2022
@clarkzinzow self-assigned this on Nov 3, 2022
@thatcort

I've tried including both ray[air]==2.2.0 and pyarrow==10.0.1 in the same requirements.txt file but get the following error:

ERROR: Cannot install pyarrow==10.0.1 and ray[air]==2.2.0 because these package versions have conflicting dependencies.

The conflict is caused by:
    The user requested pyarrow==10.0.1
    ray[air] 2.2.0 depends on pyarrow<8.0.0 and >=6.0.1; extra == "air"
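For readers wondering why pip rejects this combination, here is a minimal sketch of the range check implied by the error message. The helper names `parse_version` and `satisfies_ray_air_pin` are hypothetical, and the tuple comparison is a simplification that only handles plain dotted release numbers; pip itself applies the full PEP 440 rules.

```python
def parse_version(version: str) -> tuple:
    """Turn a plain dotted release string like '10.0.1' into (10, 0, 1)."""
    return tuple(int(part) for part in version.split("."))


def satisfies_ray_air_pin(pyarrow_version: str) -> bool:
    """Check a version against the pin quoted in the error: >=6.0.1 and <8.0.0."""
    lower = parse_version("6.0.1")
    upper = parse_version("8.0.0")
    return lower <= parse_version(pyarrow_version) < upper


# Compare a few pyarrow releases against the ray[air] 2.2.0 pin.
for candidate in ("6.0.1", "7.0.0", "10.0.1"):
    print(candidate, satisfies_ray_air_pin(candidate))
```

Since 10.0.1 sits above the `<8.0.0` upper bound, no single resolution can satisfy both lines of the requirements file.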

@clarkzinzow
Contributor Author

clarkzinzow commented Jan 26, 2023

Hi @thatcort, unfortunately the removal of the pyarrow<8.0.0 upper bound was missed in this PR, and the subsequent hotfix missed the Ray 2.2 release.

This will work as expected in the upcoming Ray 2.3 release, but if you're looking to unblock your workload in the meantime, you could either:

  1. Use a nightly Ray wheel.
  2. Do an override install of pyarrow after installing ray (the quotes keep shells such as zsh from treating the brackets as a glob pattern):
pip install "ray[air]==2.2.0"
pip install pyarrow==10.0.1
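
After an override install, it can help to confirm which versions actually ended up in the environment. The following is a small sketch using only the standard library; the helper name `installed_version` is hypothetical. Because Ray 2.2.0 still declares the `pyarrow<8.0.0` requirement, tools such as `pip check` will likely continue to report the mismatch even though the packages are installed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Report what is installed for the two packages discussed in this thread.
for name in ("ray", "pyarrow"):
    print(name, installed_version(name))
```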
