
Let actors use GPUs. #302

Merged
merged 6 commits into from
Feb 21, 2017

Conversation

robertnishihara
Collaborator

This is a quick hack to allow actors to use GPUs, so that we can figure out what the right API should be.

In this PR:

  • When creating an actor, the decorator can take a number of GPUs (and CPUs, although that field is currently not used for anything), e.g., via ray.actor(num_gpus=2).
  • Actor methods can call ray.get_gpu_ids() to get a list of the GPU IDs that the actor is allowed to use. The idea is that this could be used in the constructor of an actor (which defines a neural net) via something like
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join([str(i) for i in ray.get_gpu_ids()])
  • GPU management here is done entirely outside of the local scheduler. The worker or driver that creates the actor will interact with Redis to see how many available GPUs each local scheduler has, will take some of them for the newly created actor, and will mark them as used. If there aren't enough available GPUs, it will raise an exception. TODO: this needs to be unified with how resource management is done in the rest of the system.
  • GPUs are never freed once an actor consumes them. More generally, actors are never garbage collected. TODO: this needs to change.
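The reservation scheme described in the third bullet can be sketched roughly as follows. This is a minimal illustration only: an in-process dict stands in for Redis, and `reserve_gpus`, `NoGPUsAvailable`, and the scheduler names are hypothetical, not identifiers from the PR.

```python
class NoGPUsAvailable(Exception):
    """Raised when no local scheduler has enough free GPUs."""
    pass

# Stand-in for the Redis state: maps local scheduler ID -> list of GPU IDs
# that no actor has claimed yet.
local_schedulers = {
    "scheduler-a": [0, 1],
    "scheduler-b": [0, 1, 2, 3],
}

def reserve_gpus(num_gpus):
    """Find a local scheduler with at least num_gpus free GPUs, mark those
    GPUs as used, and return (scheduler_id, gpu_ids). Raise otherwise.

    Mirroring the last bullet above, reserved GPUs are never returned to
    the pool in this sketch.
    """
    for scheduler_id, free in local_schedulers.items():
        if len(free) >= num_gpus:
            taken = free[:num_gpus]
            local_schedulers[scheduler_id] = free[num_gpus:]
            return scheduler_id, taken
    raise NoGPUsAvailable(
        "No local scheduler has {} free GPUs.".format(num_gpus))

# scheduler-a only has 2 free GPUs, so the request lands on scheduler-b.
scheduler, gpu_ids = reserve_gpus(3)
```

The key design point the sketch captures is that the reservation happens entirely on the worker/driver side by reading and updating shared state, rather than going through the local scheduler's own resource accounting, which is exactly the unification TODO noted above.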

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/56/

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/58/


-def get_local_schedulers():
+def get_local_schedulers(worker):
     local_schedulers = []
Contributor

This is duplicated

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/60/

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/61/

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/62/

@pcmoritz pcmoritz merged commit e399f57 into ray-project:master Feb 21, 2017
@pcmoritz pcmoritz deleted the actorgpu branch February 21, 2017 09:13
3 participants