[xray] Hide "append-only log" semantics in global state API. #2852

Closed
robertnishihara opened this issue Sep 10, 2018 · 7 comments · Fixed by #3161

@robertnishihara
Collaborator

Certain global state commands expose unnecessary implementation details.

  • ray.global_state.client_table() returns a log, which can contain multiple entries for the same "client". This came up in [tune] Trial executor crashes on node removal in xray #2851. (A client-side workaround is sketched after this list.)
  • ray.global_state.object_table() returns a list of entries for each object ID; we should probably have just one entry per object ID.
  • ray.global_state.task_table() returns a list of entries for each task ID; this should also be a single entry per task ID.
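
Until the API changes, callers have to collapse the log themselves. A minimal sketch of that workaround, assuming client_table() returns the raw log entries in append order and that each entry exposes "ClientID" and "IsInsertion" fields (the field names mirror the ClientTableData discussion below, but are assumptions):

import ray

# Collapse the append-only log into one record per client: for each
# ClientID, keep only the most recent entry. A final entry with
# IsInsertion == False means the client has left the cluster.
def latest_client_states(client_log):
    clients = {}
    for entry in client_log:  # entries are in append order
        clients[entry["ClientID"]] = entry
    return clients

clients = latest_client_states(ray.global_state.client_table())
alive = {cid: e for cid, e in clients.items() if e.get("IsInsertion", True)}
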
@zhijunfu
Contributor

For the client table, it makes sense to have a single entry for each client.

For the object table, we might want to keep all "current" clients for each object, so that people can tell where the object is stored.

For the task table, it might be helpful to provide an option to return the list of entries, so that developers can use this information for debugging purposes, e.g. investigating failure and reconstruction of a task. (A possible shape for such an option is sketched below.)
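
One hypothetical shape for that debugging option; the include_history parameter and the entry contents are made up for illustration and are not part of the actual API:

import ray

task_id = ...  # some known TaskID

# Default: a single, most-recent entry per task ID.
latest = ray.global_state.task_table(task_id)

# Hypothetical opt-in flag returning the full history for debugging;
# "include_history" does not exist in the real API.
history = ray.global_state.task_table(task_id, include_history=True)
# history might look like:
# [{"State": "RUNNING", ...}, {"State": "LOST", ...}, {"State": "RUNNING", ...}]
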

Thoughts?

@robertnishihara
Collaborator Author

For the object table, we could have one entry per object, and that entry could include a list of clients or a list of creation/eviction events.
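
A sketch of what such a consolidated entry could contain; both variants are shown, and all field names here are made up for illustration:

# One hypothetical entry per object ID (field names are illustrative).
object_entry = {
    "ObjectID": "object_1",
    # Variant A: just the clients that currently hold the object.
    "Managers": ["client_1", "client_2"],
    # Variant B: the full creation/eviction history instead.
    "Events": [
        {"ClientID": "client_1", "IsEviction": False},  # created on client_1
        {"ClientID": "client_1", "IsEviction": True},   # evicted from client_1
        {"ClientID": "client_2", "IsEviction": False},  # transferred to client_2
    ],
}
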

For the task table, we actually don't store a log in the GCS; updates to the task table overwrite the current entry. This is because we use Table instead of Log in ray/src/ray/gcs/tables.cc (lines 440 to 451 at commit 588c573):

template class Log<ObjectID, ObjectTableData>;
template class Log<TaskID, ray::protocol::Task>;
template class Table<TaskID, ray::protocol::Task>;
template class Table<TaskID, TaskTableData>;
template class Log<ActorID, ActorTableData>;
template class Log<TaskID, TaskReconstructionData>;
template class Table<TaskID, TaskLeaseData>;
template class Table<ClientID, HeartbeatTableData>;
template class Log<JobID, ErrorTableData>;
template class Log<UniqueID, ClientTableData>;
template class Log<JobID, DriverTableData>;
template class Log<UniqueID, ProfileTableData>;

@guoyuhong
Contributor

I agree that the client table should have one entry per client. The bug that I'm trying to fix is caused by multiple entries for one client.

@guoyuhong
Contributor

Maybe we can create a new class named UpdatableLog? Then the ObjectTable and ClientTable could both use this UpdatableLog and keep one entry?

@robertnishihara
Collaborator Author

Ok, I understand now. The Table is already an UpdatableLog.

However, even with an UpdatableLog, we could still have a race condition where the node manager tries to connect to another dead node manager.

@guoyuhong
Contributor

Table is a single-entry structure, right? One Redis key holds only one DataT element, so we can update it as a whole. Log is different: it uses a ZSET to hold several DataT elements. For the ClientTable there should be multiple entries for different clients. If we could update a Log entry in place, then the ClientAdded function might not get an entry for a dead raylet with is_insertion=true?
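
In Redis terms, the distinction looks roughly like this; a minimal sketch with redis-py, assuming a local Redis server, with key names and serialization heavily simplified:

import redis

r = redis.Redis()

# Log semantics: every append adds another member under the same key (a ZSET),
# so two updates about the same client both survive.
r.zadd("ClientTable:log", {"client_1,is_insertion=True": 1})
r.zadd("ClientTable:log", {"client_1,is_insertion=False": 2})
print(r.zrange("ClientTable:log", 0, -1))  # both entries are still there

# Table semantics: one key holds one value, so an update overwrites in place.
r.set("TaskTable:task_1", "state=RUNNING")
r.set("TaskTable:task_1", "state=FINISHED")
print(r.get("TaskTable:task_1"))  # only the latest value remains
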

@robertnishihara
Collaborator Author

The Table data structure stores a single entry per key. The Log stores a list (a ZSET) per key. Currently the ClientTable uses only a single key.

I think it's good to keep the whole record of nodes that have joined and left the cluster, since that information may be useful for debugging and other reasons.

Even if we use a Table so that we can update the ClientTable entry in place, it is still possible for a node manager to try to connect to a remote node manager that has already died. E.g.

  1. Node manager 1 registers with the GCS.
  2. Node manager 2 registers with the GCS.
  3. ClientAdded(node_manager_1) is called on node manager 2; node manager 1 dies at the same time.
  4. As part of ClientAdded, node manager 2 tries to connect to node manager 1 and fails.
  5. Later on, the monitor detects that node manager 1 has died and updates the GCS.

So we still need to handle the case where a node manager tries to connect to a dead node manager.
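
Illustratively, the connection attempt in ClientAdded has to tolerate failure rather than assume the peer is alive. A minimal Python sketch of that pattern (the real raylet logic is C++, and everything here, including the function name, is made up for illustration):

import socket

def on_client_added(address, port):
    """Called when the GCS reports a new node manager; it may already be dead."""
    try:
        return socket.create_connection((address, port), timeout=5)
    except OSError:
        # The peer died between registering with the GCS and our connection
        # attempt. Don't treat this as fatal: the monitor will eventually
        # mark the node dead in the GCS, and a removal notification follows.
        return None
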
