Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[scheduler][6] Integrate ray with syncer. #23660

Merged
merged 85 commits into from
May 10, 2022

Conversation

fishbone
Copy link
Contributor

@fishbone fishbone commented Apr 1, 2022

Why are these changes needed?

The new syncer comes with the feature of long-polling and versioning. This PR integrates it with ray.

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@fishbone fishbone marked this pull request as ready for review April 4, 2022 04:40
@fishbone fishbone changed the title Syncer integration [scheduler][6] Integrate ray with syncer. Apr 4, 2022
@fishbone fishbone changed the title [scheduler][6] Integrate ray with syncer. [wip][scheduler][6] Integrate ray with syncer. Apr 4, 2022
@fishbone
Copy link
Contributor Author

fishbone commented Apr 5, 2022

Previous prototype: #22073

@wuisawesome wuisawesome removed their request for review April 6, 2022 07:08
@rkooo567 rkooo567 self-assigned this Apr 6, 2022
@fishbone fishbone removed the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Apr 28, 2022
@fishbone fishbone added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Apr 28, 2022
@fishbone fishbone removed the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Apr 29, 2022
@fishbone
Copy link
Contributor Author

I offline synced with @WangTaoTheTonic to fill some gaps here. There are some issues with the resource data which require more effort to do and do not fit into the scope of syncer PR. @WangTaoTheTonic is willing to offer some help on that.

Finally this PR is ready for another round of review. @scv119 @rkooo567 @jjyao @WangTaoTheTonic please give it another look.

@scv119
Copy link
Contributor

scv119 commented May 4, 2022

Will take a look this weekend!

Copy link
Collaborator

@jjyao jjyao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lg. if we decide to change component to message type, let's make sure we change it everywhere.

src/ray/common/ray_syncer/ray_syncer-inl.h Outdated Show resolved Hide resolved
@@ -45,7 +45,7 @@ class NodeState {
/// \param cid The component id to take the snapshot.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also needs to be updated.

rpc::ResourcesData resources;
resources.ParseFromString(message->sync_message());
resources.set_node_id(message->node_id());
RAY_CHECK(message->message_type() == syncer::MessageType::RESOURCE_MANAGER);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should message type be resource_report or resource_view?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about this one?

src/ray/raylet/scheduling/local_resource_manager.h Outdated Show resolved Hide resolved
src/ray/raylet/scheduling/local_resource_manager.h Outdated Show resolved Hide resolved
Copy link
Contributor

@scv119 scv119 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM , need to resolve the merge conflict and address @jjyao 's comments

@scv119 scv119 added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label May 9, 2022
src/ray/common/ray_syncer/ray_syncer.h Outdated Show resolved Hide resolved
RAY_CHECK(message_type == syncer::MessageType::RESOURCE_MANAGER);
// We check the memory inside version, so version is not a const function.
// Ideally, we need to move the memory check somewhere else.
// TODO(iycheng): Make version as a const function.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs to be updated as well.

rpc::ResourcesData resources;
resources.ParseFromString(message->sync_message());
resources.set_node_id(message->node_id());
RAY_CHECK(message->message_type() == syncer::MessageType::RESOURCE_MANAGER);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about this one?

@fishbone fishbone merged commit 6c60dbb into ray-project:master May 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants