Time streaming exec scheduling #43112

omatthew98 · 2024-02-12T19:58:28Z

Why are these changes needed?

Currently, the streaming executor does not record how much time it spends on scheduling. This PR measures the total process time (time.process_time) spent in the scheduling steps, i.e. the calls to _scheduling_loop_step. The stat is stored in DatasetStats; a later PR will include this and other StreamingExecutor stats in the DatasetStatsSummary.
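
As a rough illustration of the approach described above, the following is a minimal sketch of accumulating CPU time across repeated calls to a scheduling step. The names SchedulingTimer, timed_step and run_loop are hypothetical and are not taken from the Ray code base; only time.process_time is a real standard-library call.

```python
import time


class SchedulingTimer:
    """Hypothetical helper: accumulates CPU time spent in scheduling steps."""

    def __init__(self):
        self.total_s = 0.0

    def timed_step(self, step_fn):
        # process_time() counts CPU time of this process and excludes sleep.
        start = time.process_time()
        try:
            return step_fn()
        finally:
            self.total_s += time.process_time() - start


def run_loop(step_fn, timer):
    # Keep calling the step until it reports that there is nothing left to do.
    while timer.timed_step(step_fn):
        pass
    return timer.total_s
```

Wrapping each step individually, rather than timing the whole loop once, keeps any work done between steps out of the total.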

Related issue number

Closes #42797

Checks

I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've added any new APIs to the API Reference. For example, if I added a
  method in Tune, I've added it in doc/source/tune/api/ under the
  corresponding .rst file.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Manual testing
- Unit tests
- Release tests
- This PR is not tested :(

Signed-off-by: Matthew Owen <[email protected]>

omatthew98 · 2024-02-12T23:43:13Z

Conducted some manual testing with the following snippet (and adding a line to log the time from within StreamingExecutor.run):

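import sys
import time
import ray

def sleep(x):
    time.sleep(0.5)
    return x

# Command-line arguments are strings, so convert before passing to range().
num_rows = int(sys.argv[1]) if len(sys.argv) > 1 else 10

ds = ray.data.range(num_rows).map(sleep)

# Drain the dataset so the streaming executor runs to completion.
for _ in ds.iter_batches(batch_size=1):
    pass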

For num_rows = 100, the total scheduling time was 0.23742400000000297 seconds, and for num_rows = 1000 it was 1.063609000000004 seconds. More comprehensive testing will be added in a follow-up PR, which will update the metrics reported through DatasetStatsSummary.

python/ray/data/_internal/execution/streaming_executor.py

Signed-off-by: Matthew Owen <[email protected]>

can-anyscale · 2024-02-19T03:24:25Z

This seems to have broken linux://python/ray/data:test_streaming_integration

can-anyscale · 2024-02-20T16:48:58Z

I'm reverting this to unblock the test failures.

This reverts commit 9641c72.

…)" This reverts commit 2b92f57.

)" (ray-project#43283)" This reverts commit 2b92f57. Signed-off-by: Matthew Owen <[email protected]>

…)" (#43433) This adds an extra `None` check to fix test failures when `self._initial_stats` is not set. This reverts #43283 and restores the changes made in #43112. Signed-off-by: Matthew Owen <[email protected]>
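
The kind of guard described in that commit message can be sketched as follows. This is an illustrative assumption, not the actual Ray change: Stats, Executor, final_stats and schedule_time_s are all hypothetical stand-ins, and only the attribute name _initial_stats comes from the message above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Stats:
    # Hypothetical stand-in for a stats object carrying the scheduling time.
    schedule_time_s: float = 0.0


class Executor:
    # Hypothetical stand-in for an executor that may never receive initial stats.
    def __init__(self, initial_stats: Optional[Stats] = None):
        self._initial_stats = initial_stats

    def final_stats(self) -> Stats:
        stats = Stats()
        # Only propagate the timing if initial stats were actually set;
        # without this check, the attribute access would fail on None.
        if self._initial_stats is not None:
            stats.schedule_time_s = self._initial_stats.schedule_time_s
        return stats
```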

omatthew98 added 2 commits February 12, 2024 11:47

timing streaming exec scheduling

c1e2c6d

Signed-off-by: Matthew Owen <[email protected]>

time.perf_counter -> time.process_time to avoid timing wait

aa93fe2

Signed-off-by: Matthew Owen <[email protected]>
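
To illustrate why that commit switches clocks, here is a small sketch comparing the two standard-library timers around a blocking wait. The helper name measure is hypothetical; time.perf_counter measures elapsed wall-clock time, while time.process_time measures CPU time of the current process and does not include time spent sleeping.

```python
import time


def measure(fn):
    """Return (wall_seconds, cpu_seconds) spent while running fn."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    fn()
    wall_s = time.perf_counter() - wall_start
    cpu_s = time.process_time() - cpu_start
    return wall_s, cpu_s


# Sleeping stands in for a scheduling step that mostly waits.
wall_s, cpu_s = measure(lambda: time.sleep(0.2))
print(f"wall={wall_s:.3f}s cpu={cpu_s:.3f}s")
```

The wall-clock figure includes the whole wait, whereas the CPU figure stays close to zero. Note that time.process_time is process-wide, so CPU time used by other threads in the same process is counted as well.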

omatthew98 assigned scottjlee Feb 12, 2024

fixing format

670394c

Signed-off-by: Matthew Owen <[email protected]>

omatthew98 marked this pull request as ready for review February 13, 2024 00:34

omatthew98 requested review from ericl, scv119, c21, amogkam, scottjlee, bveeramani, raulchen and stephanie-wang as code owners February 13, 2024 00:34

scottjlee approved these changes Feb 13, 2024

c21 assigned raulchen Feb 13, 2024

adding in basic test, propagate stats from initial stats to final stats

517261a

Signed-off-by: Matthew Owen <[email protected]>

raulchen reviewed Feb 15, 2024

respond to pr feedback

c988310

Signed-off-by: Matthew Owen <[email protected]>

omatthew98 requested a review from raulchen February 15, 2024 20:42

Merge branch 'master' into time-streaming-exec-sched

436a5b4

Signed-off-by: Matthew Owen <[email protected]>

omatthew98 assigned c21 Feb 16, 2024

c21 approved these changes Feb 16, 2024

c21 merged commit 9641c72 into ray-project:master Feb 16, 2024
9 checks passed

can-anyscale added a commit that referenced this pull request Feb 20, 2024

Revert "[Data] Time streaming exec scheduling (#43112)"

4876d1a

This reverts commit 9641c72.

can-anyscale mentioned this pull request Feb 20, 2024

Revert "Time streaming exec scheduling" #43283

Merged

can-anyscale added a commit that referenced this pull request Feb 20, 2024

Revert "[Data] Time streaming exec scheduling (#43112)" (#43283)

2b92f57

This reverts commit 9641c72.

khluu pushed a commit that referenced this pull request Feb 21, 2024

Revert "[Data] Time streaming exec scheduling (#43112)" (#43283)

97bb3f7

This reverts commit 9641c72.

omatthew98 added a commit that referenced this pull request Feb 26, 2024

Revert "Revert "[Data] Time streaming exec scheduling (#43112)" (#43283…

30dfbf3

…)" This reverts commit 2b92f57.

omatthew98 mentioned this pull request Feb 26, 2024

Revert "Revert "[Data] Time streaming exec scheduling (#43112)" (#43283)" #43433

Merged

8 tasks

omatthew98 added a commit to omatthew98/ray that referenced this pull request Feb 26, 2024

Revert "Revert "[Data] Time streaming exec scheduling (ray-project#43112

11b5862

)" (ray-project#43283)" This reverts commit 2b92f57. Signed-off-by: Matthew Owen <[email protected]>
