Revert "Revert "[Datasets] [Tensor Story - 1/2] Automatically provide tensor views to UDFs and infer tensor blocks for pure-tensor datasets."" #25031

clarkzinzow
Contributor

Fixes the check ingest utility to handle non-Pandas native batches.

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

… tensor views to UDFs and infer tensor blocks for pure-tensor datasets. (ray-project#24812)" (ray-project#25017)"

This reverts commit fbfb134.
elif isinstance(batch, np.ndarray):
    num_bytes += batch.nbytes
else:
    # NOTE: This isn't recursive and will just return the size of
Collaborator

This seems to be the recommended recursive approach: https://code.activestate.com/recipes/577504/, but I think we can leave it as a TODO.

Contributor Author

Yeah, I was going to open a separate issue for this: we currently calculate the byte size of simple blocks with this top-level sys.getsizeof(), but we might want to use a recursive recipe both there and here.
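
For illustration, here is a minimal sketch of the difference between a top-level sys.getsizeof() call and a recursive traversal. It is not the ActiveState recipe linked above and not Ray's implementation; the name total_size and the set of container types it walks are assumptions made for this example.

```python
import sys
from collections import deque


def total_size(obj):
    """Approximate the deep size of obj in bytes (hypothetical helper).

    Walks dicts and common sequence/set containers; any other object is
    counted with its shallow sys.getsizeof() value only.
    """
    seen = set()   # ids of objects already counted (shared refs, cycles)
    stack = [obj]  # explicit stack avoids recursion-depth limits
    total = 0
    while stack:
        current = stack.pop()
        if id(current) in seen:
            continue
        seen.add(id(current))
        total += sys.getsizeof(current)
        if isinstance(current, dict):
            stack.extend(current.keys())
            stack.extend(current.values())
        elif isinstance(current, (list, tuple, set, frozenset, deque)):
            stack.extend(current)
    return total


# A list of lists: the shallow size only covers the outer list's pointers.
nested = [[0] * 1000 for _ in range(10)]
print("shallow:", sys.getsizeof(nested))
print("deep:   ", total_size(nested))
```

The shallow figure ignores the ten inner lists entirely, which is the undercount the NOTE in the diff refers to.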

Contributor

In the future we can use BlockAccessor.for_block(b).size_bytes()

Contributor Author

@ericl For sure, but it's a batch, not a block! So:

block = BlockAccessor.batch_to_block(batch)
num_bytes += BlockAccessor.for_block(block).size_bytes()

Contributor

Got it.
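
As a rough sketch of the per-batch accounting discussed in this thread: the snippet below prefers an array-style nbytes attribute when the batch exposes one and otherwise falls back to the shallow sys.getsizeof(). It deliberately avoids Ray and NumPy so it runs with the standard library alone. The names batch_num_bytes and total_bytes are hypothetical, and the duck-typed nbytes check is an assumption standing in for the isinstance(batch, np.ndarray) branch in the diff.

```python
import sys


def batch_num_bytes(batch):
    """Estimate the size of one batch in bytes (hypothetical helper)."""
    # Array-like objects (for example NumPy arrays or memoryviews) report
    # their buffer size through an integer `nbytes` attribute.
    nbytes = getattr(batch, "nbytes", None)
    if isinstance(nbytes, int):
        return nbytes
    # Fallback: shallow size only; nested contents are not counted.
    return sys.getsizeof(batch)


def total_bytes(batches):
    """Sum the estimated sizes over an iterable of batches."""
    return sum(batch_num_bytes(b) for b in batches)


# Usage: a buffer-backed batch next to a plain Python list batch.
buffer_batch = memoryview(bytearray(64))
list_batch = [1, 2, 3]
print(total_bytes([buffer_batch, list_batch]))
```

Routing through a block accessor, as suggested above, would replace the fallback branch with a single code path for every batch format.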

Contributor Author

clarkzinzow commented May 20, 2022

ML tests now pass, and the failing Datasets test is the flaky test that has been reverted in master, so this looks good to merge. We can wait until the macOS CI jobs complete, since this isn't time-sensitive.

clarkzinzow added the tests-ok label (The tagger certifies test failures are unrelated and assumes personal liability.) on May 20, 2022
ericl merged commit 9ea5a8e into ray-project:master on May 20, 2022
mwtian added a commit that referenced this pull request May 21, 2022
… provide tensor views to UDFs and infer tensor blocks for pure-tensor datasets."" (#25031)"

This reverts commit 9ea5a8e.
scv119 pushed a commit that referenced this pull request May 26, 2022
… provide tensor views to UDFs and infer tensor blocks for pure-tensor datasets."" (#25031)" (#25057)

Reverts #25031

It looks to be still somewhat flaky.
clarkzinzow added a commit to clarkzinzow/ray that referenced this pull request Jun 7, 2022
…atically provide tensor views to UDFs and infer tensor blocks for pure-tensor datasets."" (ray-project#25031)" (ray-project#25057)"

This reverts commit fb2933a.
clarkzinzow added a commit that referenced this pull request Jun 8, 2022
…ovide tensor views to UDFs and infer tensor blocks for pure-tensor datasets. (#25031)"  (#25531)

Unreverts #24812, skipping the memory releasing tests that are already flaky. We have a separate issue tracking the unskipping of these memory releasing tests, once we find a more reliable way to test them.

* Revert "Revert "Revert "Revert "[Datasets] [Tensor Story - 1/2] Automatically provide tensor views to UDFs and infer tensor blocks for pure-tensor datasets."" (#25031)" (#25057)"

This reverts commit fb2933a.

* Skip shuffle memory release test.
sumanthratna pushed a commit to sumanthratna/ray that referenced this pull request Jun 8, 2022
…ovide tensor views to UDFs and infer tensor blocks for pure-tensor datasets. (ray-project#25031)"  (ray-project#25531)

Unreverts ray-project#24812, skipping the memory releasing tests that are already flaky. We have a separate issue tracking the unskipping of these memory releasing tests, once we find a more reliable way to test them.

* Revert "Revert "Revert "Revert "[Datasets] [Tensor Story - 1/2] Automatically provide tensor views to UDFs and infer tensor blocks for pure-tensor datasets."" (ray-project#25031)" (ray-project#25057)"

This reverts commit fb2933a.

* Skip shuffle memory release test.
4 participants