Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[runtime env] Add plugin name to internal URI format and add GC for py_modules #20009

Merged
merged 9 commits into from
Nov 4, 2021

Conversation

architkulkarni
Copy link
Contributor

@architkulkarni architkulkarni commented Nov 3, 2021

Why are these changes needed?

  • For any URIs used in runtime envs, internally (only in the C++ layer) convert URIs to the format plugin|protocol://pkg_name . We need the plugin name in the URI because the runtime env agent needs to know which plugin to call the delete method of just based on the URI.
  • Add GC for py_modules

Related issue number

Closes #19775

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@architkulkarni
Copy link
Contributor Author

This is ready for review, test_runtime_env_working_dir is passing locally

Copy link
Contributor

@edoakes edoakes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good, but I would prefer if we can keep this protocol | URI thing purely an internal implementation detail and minimally scoped to avoid issues like the job submission needing to worry about it.

Based on my understanding, we only need this in order to tell the agent which plugin it should call delete on when GC happens, right? If that's the case, what if we just do the following:

  • Prepend plugin_name| to the URIs when we call get_uris to pass it into C++-land.
  • In the agent, when DeleteURI is called, parse out the plugin_name and URI before calling into a specific plugin for deletion.

This should isolate this to just those two specific places instead of needing to inject this logic all over the place. What do you think?

@architkulkarni
Copy link
Contributor Author

Nice, that's better! I'll update this PR.

@architkulkarni architkulkarni added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Nov 3, 2021
Copy link
Contributor

@edoakes edoakes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@architkulkarni this is looking way better. I'm a little disconcerted that we have to do the string parsing in C++, but it may be ok.

Is it possible that we just make this a more structured format in the protobuf instead of just a list of strings? We could have the protobuf be:

message InternalURI {
  string protocol = 1;
  string uri = 2;
}

Then the GCS could just look at protocol and we could avoid the whole parsing thing. Would that work? Sorry for not thinking of it sooner, it might be a cleaner way to do this... if this is a big change I'm ok with the status quo

Comment on lines -24 to -26
if (unused_uris_.count(uri)) {
unused_uris_.erase(uri);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the context for this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell it wasn't doing anything, it was just getting populated and deleted but was never read anywhere else. It might have been leftover from some earlier implementation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gotcha

Comment on lines +403 to +413
runtime_env_manager_ = std::make_unique<RuntimeEnvManager>(
/*deleter=*/[this](const std::string &plugin_uri, auto cb) {
// A valid runtime env URI is of the form "plugin|protocol://hash".
std::string plugin_sep = "|";
std::string protocol_sep = "://";
auto plugin_end_pos = plugin_uri.find(plugin_sep);
auto protocol_end_pos = plugin_uri.find(protocol_sep);
if (protocol_end_pos == std::string::npos ||
plugin_end_pos == std::string::npos) {
RAY_LOG(ERROR) << "Plugin URI must be of form "
<< "<plugin>|<protocol>://<hash>, got " << plugin_uri;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, ouch, I didn't realize we'd need to parse this in C++

@architkulkarni
Copy link
Contributor Author

@architkulkarni this is looking way better. I'm a little disconcerted that we have to do the string parsing in C++, but it may be ok.

Is it possible that we just make this a more structured format in the protobuf instead of just a list of strings? We could have the protobuf be:

message InternalURI {
  string protocol = 1;
  string uri = 2;
}

Then the GCS could just look at protocol and we could avoid the whole parsing thing. Would that work? Sorry for not thinking of it sooner, it might be a cleaner way to do this... if this is a big change I'm ok with the status quo

@edoakes no worries, I also should have thought of this. This sounds good, you mean plugin instead of protocol right?

It's a conceptually simple change but I think it'll require a lot of updates across the Cython and C++, so how about we do it in a followup PR?

@edoakes
Copy link
Contributor

edoakes commented Nov 3, 2021

@architkulkarni I'm fine w/ that

@edoakes
Copy link
Contributor

edoakes commented Nov 3, 2021

and yes meant plugin not protocol

@edoakes edoakes merged commit bcb6396 into ray-project:master Nov 4, 2021
@architkulkarni architkulkarni deleted the add-gc-py-modules branch November 4, 2021 17:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[runtime_env] Implement GC for py_modules URIs
3 participants