Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prototype named actors. #2129

Merged
merged 6 commits into from
May 24, 2018
Merged

Prototype named actors. #2129

merged 6 commits into from
May 24, 2018

Conversation

heyucongtom
Copy link
Contributor

@heyucongtom heyucongtom commented May 23, 2018

What do these changes do?

Since it's a total re-factorization, I feel like it may be cleaner to open up a new commit tree for the modification. It refer to the same issue as discussed in the pull-request #2120. (Please feel free to close that one.)

This commit deal with:

  1. Implement the get_actor and register_actor inside the experimental directory.
  2. Add unit-test for single driver environment.
  3. Raise appropriate errors for the operation.

Related issue number

#1424 #2120

@robertnishihara robertnishihara mentioned this pull request May 23, 2018
Copy link
Collaborator

@robertnishihara robertnishihara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Please make sure the linting passes.

# Test getting f
f2 = ray.experimental.get_actor("f1")
self.assertEqual(f1._actor_id, f2._actor_id)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's modify the actor to be something like

def method(self):
    self.x += 1
    return self.x

so that here we can do

self.assertEqual(ray.get(f1.method.remote(), 1)
self.assertEqual(ray.get(f2.method.remote(), 2)
self.assertEqual(ray.get(f1.method.remote(), 3)
self.assertEqual(ray.get(f2.method.remote(), 4)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handled.


# Test getting an unexist actor
with self.assertRaises(AssertionError):
err = ray.experimental.get_actor("unexisted")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unexisted -> nonexistent

@@ -1906,6 +1906,36 @@ def method(self):
# we should also test this from a different driver.
ray.get(new_f.method.remote())

def testRegisterAndGetActorHandle(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we change this to testRegisterAndGetNamedActors? It'll be easier to find by searching

def _calculate_key_(name):
return b"Actor:" + str.encode(name)

def get_actor(name):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add docstrings for these methods with the format used e.g., in

ray/python/ray/worker.py

Lines 2443 to 2458 in 4584193

def get(object_ids, worker=global_worker):
"""Get a remote object or a list of remote objects from the object store.
This method blocks until the object corresponding to the object ID is
available in the local object store. If this object is not in the local
object store, it will be shipped from an object store that has it (once the
object has been created). If object_ids is a list, then the objects
corresponding to each object in the list will be returned.
Args:
object_ids: Object ID of the object to get or a list of object IDs to
get.
Returns:
A Python object or a list of Python objects.
"""

assert not is_existed, \
"Error: the actor with name={} already exists".format(name)
pickled_state = pickle.dumps(actor_handle)
worker.redis_client.hmset(actor_hash, {name: pickled_state})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a race condition with this approach. It's possible that another worker could add publish a named actor with the same name between the calls to hexists and hmset. You can avoid this with hsetnx. See https://redis.io/commands/hsetnx.

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's reserve assertions for checking things that should never actually happen. The errors (in both register_actor and get_actor) can happen if the user passes in the wrong values, so we should raise some sort of exception

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handled.

def register_actor(name, actor_handle):
worker = ray.worker.get_global_worker()
actor_hash = _calculate_key_(name)
assert type(actor_handle) == ray.actor.ActorHandle, \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should raise TypeError

assert type(actor_handle) == ray.actor.ActorHandle, \
"Error: you could only store named-actors."
is_existed = worker.redis_client.hexists(actor_hash, name)
assert not is_existed, \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should raise ValueError

def get_actor(name):
worker = ray.worker.get_global_worker()
actor_hash = _calculate_key_(name)
pickled_state = worker.redis_client.hmget(actor_hash, name)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just use hget and then remove the assertion below.

pickled_state = worker.redis_client.hmget(actor_hash, name)
assert len(pickled_state) == 1, \
"Error: Multiple actors under this name."
assert pickled_state[0] is not None, \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should raise ValueError

@robertnishihara robertnishihara changed the title New commit tree for the named actor Prototype named actors. May 23, 2018
@robertnishihara
Copy link
Collaborator

To do linting, you can do (from ray/)

yapf --style .style.yapf -i  python/ray/experimental/__init__.py  python/ray/experimental/named_actors.py test/actor_test.py

also

flake8 python/ray/experimental/__init__.py  python/ray/experimental/named_actors.py test/actor_test.py

@@ -7,10 +7,13 @@
flush_redis_unsafe, flush_task_and_object_metadata_unsafe,
flush_finished_tasks_unsafe, flush_evicted_objects_unsafe,
_flush_finished_tasks_unsafe_shard, _flush_evicted_objects_unsafe_shard)
from .named_actors import (
get_actor, register_actor
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like this can all fit on one line without the parentheses

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/5597/
Test FAILed.

@@ -869,7 +869,8 @@ def _serialization_helper(self, ray_forking):
_ray_actor_creation_dummy_object_id.id(),
"actor_method_cpus": self._ray_actor_method_cpus,
"actor_driver_id": self._ray_actor_driver_id.id(),
"previous_actor_handle_id": self._ray_actor_handle_id.id(),
"previous_actor_handle_id": self._ray_actor_handle_id.id()
if self._ray_actor_handle_id else None,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this change needed?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm.. ok I see that the test failures without it.

Copy link
Collaborator

@robertnishihara robertnishihara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I pushed some stylistic changes. I'll merge this once the tests pass.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/5602/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/5606/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/5609/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/5610/
Test PASSed.

@robertnishihara robertnishihara merged commit 3509a33 into ray-project:master May 24, 2018
@robertnishihara
Copy link
Collaborator

One thing I noticed is that calling get_actor twice on the same name will cause problems (once you try to use the actor). This is probably the same issue as #2115.

@heyucongtom
Copy link
Contributor Author

heyucongtom commented May 24, 2018 via email

alok added a commit to alok/ray that referenced this pull request May 25, 2018
* master:
  Prototype named actors. (ray-project#2129)
  Update arrow to latest master (ray-project#2100)
  [DataFrame] Speed up dtypes (ray-project#2118)
  do not fetch from dead Plasma Manager (ray-project#2116)
  [DataFrame] Refactor GroupBy Methods and Implement Reindex (ray-project#2101)
  Initial Support for Airspeed Velocity (ray-project#2113)
  Use automatic memory management in Redis modules. (ray-project#1797)
  [DataFrame] Test bugfixes (ray-project#2111)
  [DataFrame] Update initializations of IndexMetadata which use outdated APIs (ray-project#2103)
alok added a commit to alok/ray that referenced this pull request May 30, 2018
* fix-a3c-torch:
  Prototype named actors. (ray-project#2129)
  Update arrow to latest master (ray-project#2100)
  [DataFrame] Speed up dtypes (ray-project#2118)
  do not fetch from dead Plasma Manager (ray-project#2116)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants