
No spreading if a node is selected for lease request due to locality #22015

Merged — 7 commits merged into ray-project:master on Feb 3, 2022

Conversation

@jjyao (Collaborator) commented Jan 31, 2022

Why are these changes needed?

  1. If a node is selected based on locality, we always run the task on that node as long as it is available.
  2. For the SPREAD scheduling strategy, we always select the local node as the first raylet to request a lease from; locality is not involved.

Related issue number

Closes #18581

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@jjyao (Collaborator, Author) commented Jan 31, 2022

I still need to add tests, etc., but I'd like to request an early review to make sure the overall approach is sound.

  // If no raylet address is given, find the best worker for our next lease request.
- best_node_address = lease_policy_->GetBestNodeForTask(resource_spec);
+ std::tie(best_node_address, is_selected_based_on_locality) =
+     lease_policy_->GetBestNodeForTask(resource_spec);
Contributor:

Shouldn't we only run this logic if the scheduling strategy is DEFAULT?

jjyao (Collaborator, Author):

Currently that check is inside the lease policy: it checks the scheduling strategy and decides whether it needs to run the locality logic.
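As jjyao notes, the strategy check lives inside the lease policy itself. A minimal Python sketch of that shape — all names (`ResourceSpec`, `get_best_node_for_task`, the location map) are illustrative, not Ray's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class ResourceSpec:
    scheduling_strategy: str            # e.g. "DEFAULT" or "SPREAD"
    dependencies: list = field(default_factory=list)

def get_best_node_for_task(spec, local_node, object_locations):
    """Return (best_node, is_selected_based_on_locality).

    object_locations: dependency id -> node currently holding that object.
    """
    if spec.scheduling_strategy != "DEFAULT":
        # Non-DEFAULT strategies (e.g. SPREAD) skip the locality logic entirely.
        return local_node, False
    for dep in spec.dependencies:
        node = object_locations.get(dep)
        if node is not None:
            # A node holding a dependency wins; mark it as locality-selected.
            return node, True
    return local_node, False

locs = {"obj1": "nodeB"}
assert get_best_node_for_task(ResourceSpec("DEFAULT", ["obj1"]), "nodeA", locs) == ("nodeB", True)
assert get_best_node_for_task(ResourceSpec("SPREAD", ["obj1"]), "nodeA", locs) == ("nodeA", False)
```

The caller never needs its own strategy check: the boolean in the returned pair tells it whether locality drove the choice.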

@ericl (Contributor) left a comment:

To make sure, the desired precedence is:

  1. SPREAD strategy => always spread, ignore locality completely
  2. DEFAULT / locality present => set spread threshold to 1.0 (no spreading at all)
  3. DEFAULT / no-locality => default spread threshold

Is this right?

@ericl added the @author-action-required label Jan 31, 2022
@jjyao (Collaborator, Author) commented Feb 1, 2022

To make sure, the desired precedence is:

  1. SPREAD strategy => always spread, ignore locality completely
  2. DEFAULT / locality present => set spread threshold to 1.0 (no spreading at all)
  3. DEFAULT / no-locality => default spread threshold

Is this right?

That's right.
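The confirmed precedence can be sketched as a single selection function. The 0.5 default and the 0.0 used for SPREAD below are placeholder assumptions for illustration, not Ray's actual constants:

```python
DEFAULT_SPREAD_THRESHOLD = 0.5  # assumed default; Ray's real value may differ

def choose_spread_threshold(strategy, is_selected_based_on_locality):
    """Map (strategy, locality flag) to a hybrid-policy spread threshold."""
    if strategy == "SPREAD":
        # 1. SPREAD: always spread; locality is ignored completely.
        #    (Modeled here as threshold 0.0 for the sketch.)
        return 0.0
    if is_selected_based_on_locality:
        # 2. DEFAULT with a locality-selected node: threshold 1.0 means
        #    pack onto that node until full, i.e. no spreading at all.
        return 1.0
    # 3. DEFAULT without locality: fall back to the default threshold.
    return DEFAULT_SPREAD_THRESHOLD

assert choose_spread_threshold("SPREAD", True) == 0.0   # locality flag ignored
assert choose_spread_threshold("DEFAULT", True) == 1.0
assert choose_spread_threshold("DEFAULT", False) == DEFAULT_SPREAD_THRESHOLD
```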

@stephanie-wang (Contributor):

Is there a way we can set the spread threshold only on the owner side? That way the scheduler could just take the spread threshold directly from the TaskSpec.

@jjyao changed the title from [WIP] Set spread threshold to 1.0 for locality scheduling to [WIP] No spreading if a node is selected for lease request due to locality Feb 2, 2022
@jjyao (Collaborator, Author) commented Feb 2, 2022

@ericl @stephanie-wang Updated; now only the scheduler decides the spread threshold.

@@ -1558,7 +1558,8 @@ std::string ClusterTaskManager::GetBestSchedulableNode(const internal::Work &work,
                                                        bool *is_infeasible) {
   // If the local node is available, we should directly return it instead of
   // going through the full hybrid policy since we don't want spillback.
-  if (work.grant_or_reject && !force_spillback && IsLocallySchedulable(work.task)) {
+  if ((work.grant_or_reject || work.is_selected_based_on_locality) && !force_spillback &&
+      IsLocallySchedulable(work.task)) {
Contributor:

Would it make sense to unify grant_or_reject and is_selected_based_on_locality?

jjyao (Collaborator, Author):

They are the same if the node is available. But if the node is not available, their behaviors differ: one spills back, the other rejects. I think it's clearer to keep them separate.
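A rough sketch of that distinction, with illustrative names rather than Ray's actual code: both flags pin the task to the local node while it is available, but they diverge when it is not:

```python
def schedule_decision(grant_or_reject, is_selected_based_on_locality,
                      local_node_available):
    """Toy model of the branch discussed above.

    Returns one of: "run_locally", "reject", "spill", "hybrid_policy".
    """
    if local_node_available:
        if grant_or_reject or is_selected_based_on_locality:
            # Identical behavior while the local node can take the task.
            return "run_locally"
        return "hybrid_policy"
    if grant_or_reject:
        # Lease request is rejected; the caller retries on another raylet.
        return "reject"
    if is_selected_based_on_locality:
        # Locality-selected task falls through and spills to another node.
        return "spill"
    return "hybrid_policy"

# Same behavior when the node is available...
assert schedule_decision(True, False, True) == "run_locally"
assert schedule_decision(False, True, True) == "run_locally"
# ...different behavior when it is not.
assert schedule_decision(True, False, False) == "reject"
assert schedule_decision(False, True, False) == "spill"
```

Unifying the flags would lose exactly the unavailable-node branch, which is why keeping them separate is arguably clearer.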

@jjyao added the do-not-merge label Feb 2, 2022
@jjyao changed the title from [WIP] No spreading if a node is selected for lease request due to locality to No spreading if a node is selected for lease request due to locality Feb 2, 2022
@jjyao removed the @author-action-required label Feb 2, 2022
@jjyao (Collaborator, Author) commented Feb 2, 2022

This improves locality-aware scheduling but doesn't fix the fundamental problem: the resource view and the object view are kept in two separate places (the owner core worker and the raylet). Without this information in a single place, we have the following issues:

  1. Spillback is not locality aware: ideally, if the best locality node is not available, we may want to spill back to the second-best locality node.
  2. The best locality node may not have the resources needed by the task (e.g., the task needs custom resource A but the best locality node doesn't have it), yet the core worker will still request the lease from that node.
  3. Actor scheduling is not locality aware.

These issues may not cause much trouble in real workloads, and this PR may be enough, but they are something to keep in mind when we refactor the scheduler. cc @scv119 @iycheng

@jjyao (Collaborator, Author) commented Feb 2, 2022

Don't merge it yet; I want to run the nightly tests first.

@clarkzinzow (Contributor):

FYI @jjyao, for (1) we have an open issue from the original locality-aware scheduling work, along with a basic implementation idea: on spillback, the raylet returns all available nodes and the core worker chooses the best locality node from that set. It has never been prioritized since we haven't created a test workload that demonstrates the need for it. It also predates the hybrid scheduling policy and wouldn't work very well with it (it breaks under-threshold round-robin packing), so implementing a locality-aware spillback policy within the hybrid scheduling policy would probably yield much better results.

(2) is a great point that's always bothered me about this design, and we've heard (3) requested from users before. Really looking forward to addressing all of these with the redesign!
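The spillback idea above — the core worker choosing the best locality node from the set the raylet returns — could look roughly like this. This is a hypothetical sketch; `best_locality_node` and its inputs are not Ray APIs:

```python
def best_locality_node(candidate_nodes, arg_sizes, object_locations):
    """Pick the candidate node holding the most bytes of the task's arguments.

    candidate_nodes: available nodes reported by the raylet on spillback.
    arg_sizes: argument object id -> size in bytes.
    object_locations: argument object id -> set of nodes holding that object.
    """
    def local_bytes(node):
        # Total argument bytes already resident on this node.
        return sum(size for obj, size in arg_sizes.items()
                   if node in object_locations.get(obj, ()))
    return max(candidate_nodes, key=local_bytes)

nodes = ["nodeA", "nodeB", "nodeC"]
sizes = {"x": 100, "y": 400}
locs = {"x": {"nodeA"}, "y": {"nodeB"}}
# nodeB holds 400 bytes of arguments vs. nodeA's 100 and nodeC's 0.
assert best_locality_node(nodes, sizes, locs) == "nodeB"
```

Maximizing locally resident argument bytes is the usual objective in locality-aware placement, since it minimizes the data that must be transferred before the task can run.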

@clarkzinzow (Contributor) left a comment:

LGTM, great tests!

python/ray/tests/test_scheduling.py (review thread resolved)
@ericl merged commit 44db41c into ray-project:master Feb 3, 2022
@ericl (Contributor) commented Feb 3, 2022

Ah... I didn't see the label, sorry! Hopefully it works with the nightly tests.

@jjyao (Collaborator, Author) commented Feb 3, 2022

The nightly tests are broken right now. Hopefully this works; otherwise we can revert it. I'll keep an eye on the nightly tests once they recover.

@jjyao deleted the jjyao/locality branch February 3, 2022 20:47
@jjyao restored the jjyao/locality branch February 4, 2022 16:22
@jjyao deleted the jjyao/locality branch February 4, 2022 17:55
@jjyao removed the do-not-merge label Feb 22, 2022
Successfully merging this pull request may close these issues.

[core][scheduler] Data locality scheduling doesn't always work with distributed scheduler policy
5 participants