[data] [3/3] [no_early_kickoff] Async iter batches e2e #33620

Merged
amogkam merged 91 commits into ray-project:master from async-iter-batches-3
Mar 27, 2023

@amogkam amogkam commented Mar 23, 2023

Final PR for async iter_batches.

The new codepath is enabled for streaming execution. The old codepath is still accessible via a feature flag in DatasetContext.

Bulk execution still uses the old codepath by default.
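
A rough sketch of how a flag-gated choice between the two codepaths could look. This is not Ray's implementation: `SketchContext`, `use_legacy_path`, and the helper functions are hypothetical names for illustration only, and the "async" path is approximated with a background thread that fills a bounded queue.

```python
import queue
import threading
from dataclasses import dataclass
from typing import Any, Iterable, Iterator, List, Optional

_DONE = object()  # marks the end of the stream


@dataclass
class SketchContext:
    # Hypothetical stand-in for the feature flag in DatasetContext.
    use_legacy_path: bool = False


@dataclass
class _Failure:
    exc: Exception


def _chunk(rows: Iterable[Any], batch_size: int) -> Iterator[List[Any]]:
    """Synchronous ("old") path: build batches one at a time on demand."""
    batch: List[Any] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def _prefetching(batches: Iterator[List[Any]], prefetch_batches: int) -> Iterator[List[Any]]:
    """Async ("new") path: a producer thread stays up to N batches ahead."""
    buf: queue.Queue = queue.Queue(maxsize=max(1, prefetch_batches))

    def produce() -> None:
        try:
            for batch in batches:
                buf.put(batch)  # blocks once the buffer is full
        except Exception as exc:
            buf.put(_Failure(exc))  # surface producer errors to the consumer
        else:
            buf.put(_DONE)

    # Daemon thread, so an abandoned iterator does not block interpreter exit.
    threading.Thread(target=produce, daemon=True).start()
    while True:
        item = buf.get()
        if item is _DONE:
            return
        if isinstance(item, _Failure):
            raise item.exc
        yield item


def iter_batches(
    rows: Iterable[Any],
    *,
    batch_size: int,
    prefetch_batches: int = 1,
    ctx: Optional[SketchContext] = None,
) -> Iterator[List[Any]]:
    ctx = ctx or SketchContext()
    batches = _chunk(rows, batch_size)
    if ctx.use_legacy_path:
        return batches
    return _prefetching(batches, prefetch_batches)
```

Both paths yield the same batches in the same order; only the point at which each batch is produced differs.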

This PR also deprecates the prefetch_batches argument of map_batches, since it is half-baked and not fully supported.
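
For illustration, a deprecation like this is commonly signalled with a warning when the argument is passed. The sketch below assumes that pattern; `map_batches_sketch` is a hypothetical stand-in, not the actual map_batches signature or the warning text used in this PR.

```python
import warnings
from typing import Any, Callable, Iterable, List, Optional


def map_batches_sketch(
    batches: Iterable[List[Any]],
    fn: Callable[[List[Any]], List[Any]],
    *,
    prefetch_batches: Optional[int] = None,  # deprecated in this sketch
) -> List[List[Any]]:
    if prefetch_batches is not None:
        # The argument is accepted for compatibility but otherwise ignored.
        warnings.warn(
            "prefetch_batches is deprecated for map_batches and has no effect.",
            DeprecationWarning,
            stacklevel=2,
        )
    return [fn(batch) for batch in batches]
```

Callers that omit the argument see no warning, so existing code that never used it is unaffected.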

Why are these changes needed?

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: amogkam <[email protected]>
@ericl ericl added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Mar 25, 2023
amogkam commented Mar 27, 2023

Failing tests are also failing on master, going to merge.

@amogkam amogkam merged commit f19018e into ray-project:master Mar 27, 2023
@amogkam amogkam deleted the async-iter-batches-3 branch March 27, 2023 23:43
@amogkam amogkam mentioned this pull request Mar 28, 2023
amogkam added a commit that referenced this pull request Mar 28, 2023
#33713 changed test_dataset_stats_basic to check for iteration stats when using the streaming executor.

#33620 changed the stats behavior for the streaming executor but had not pulled in #33713, so test_stats still passed on that branch.

Once both were merged into master, test_stats started failing. This PR fixes it.

---------

Signed-off-by: amogkam <[email protected]>
elliottower pushed a commit to elliottower/ray that referenced this pull request Apr 22, 2023
elliottower pushed a commit to elliottower/ray that referenced this pull request Apr 22, 2023
ProjectsByJackHe pushed a commit to ProjectsByJackHe/ray that referenced this pull request May 4, 2023
ProjectsByJackHe pushed a commit to ProjectsByJackHe/ray that referenced this pull request May 4, 2023
5 participants