This repository has been archived by the owner on Mar 31, 2023. It is now read-only.

High availability support #103

Open
vditya opened this issue Oct 20, 2016 · 6 comments

Comments

@vditya

vditya commented Oct 20, 2016

If one builds a framework on top of Fenzo, what are the guidelines for enabling high availability of the framework? Specifically, assuming ZooKeeper is used to provide leader election between framework instances, how can Fenzo's local state (for example, task queues, running state, etc.) be synchronized to the other instances of the framework built on Fenzo?

Thanks for your help.

@spodila
Contributor

spodila commented Oct 25, 2016

@vditya Since Fenzo is a library used within the framework, Fenzo does not need an explicit mechanism of its own to enable high availability. I would expect the framework to define this, since it already needs a state persistence store. There are, for example, two ways one could achieve this in the framework.

One, the elected leader is the only one that modifies state in an external persistent store (e.g., Cassandra). Such modifications include the state that is passed into Fenzo (assign and unassign task calls). Upon new leader election, the framework would initialize the state from the persistent store and call Fenzo to assign all tasks that were running before. Then, the framework would call Fenzo's TaskScheduler.scheduleOnce(...) to assign resources to new pending tasks. We currently use this technique with Cassandra as our persistence store.
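
For illustration, here is a minimal sketch of that flow. The StateStore interface below is hypothetical and stands in for whatever persistence layer your framework uses (e.g., Cassandra); only getTaskAssigner() and scheduleOnce(...) are Fenzo APIs:

```java
import com.netflix.fenzo.TaskRequest;
import com.netflix.fenzo.TaskScheduler;
import java.util.List;

// Hypothetical persistence interface -- stands in for whatever store the
// framework uses (e.g., Cassandra); it is not part of Fenzo.
interface StateStore {
    final class RunningTask {
        final TaskRequest request;
        final String hostname;
        RunningTask(TaskRequest request, String hostname) {
            this.request = request;
            this.hostname = hostname;
        }
    }
    List<RunningTask> loadRunningTasks();
}

class LeaderInitializer {
    // On being elected leader, rebuild Fenzo's in-memory assignment state
    // from the persistent store before scheduling anything new.
    static void onElectedLeader(TaskScheduler scheduler, StateStore store) {
        for (StateStore.RunningTask t : store.loadRunningTasks()) {
            // Tell Fenzo this task is already running on this host.
            scheduler.getTaskAssigner().call(t.request, t.hostname);
        }
        // From here on, the normal scheduling loop calls
        // scheduler.scheduleOnce(pendingTasks, newLeases) for pending tasks.
    }
}
```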

Two, all instances of the framework perform log replication across each other. Each framework instance calls Fenzo to update its in-memory state (assign/unassign tasks) as it applies the replicated events. Only the elected leader calls Fenzo's TaskScheduler.scheduleOnce(...) to assign resources to tasks. One advantage of this approach is that there is no need to initialize state upon leader election.
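
A rough sketch of this second approach. The ReplicatedEvent type and its fields are illustrative, not Fenzo or Mesos types; only getTaskAssigner() and getTaskUnAssigner() are Fenzo APIs:

```java
import com.netflix.fenzo.TaskRequest;
import com.netflix.fenzo.TaskScheduler;

// Hypothetical replicated-log event -- the event types and fields are
// illustrative, not Fenzo or Mesos types.
class ReplicatedEvent {
    enum Type { ASSIGNED, COMPLETED }
    final Type type;
    final TaskRequest request;   // set for ASSIGNED events
    final String taskId;         // set for COMPLETED events
    final String hostname;
    ReplicatedEvent(Type type, TaskRequest request, String taskId, String hostname) {
        this.type = type;
        this.request = request;
        this.taskId = taskId;
        this.hostname = hostname;
    }
}

class ReplicaApplier {
    // Every instance (leader and followers) applies each replicated event to
    // its local Fenzo state; only the leader runs scheduleOnce(...), while
    // followers keep their state warm so failover needs no re-initialization.
    static void apply(TaskScheduler scheduler, ReplicatedEvent e) {
        switch (e.type) {
            case ASSIGNED:
                scheduler.getTaskAssigner().call(e.request, e.hostname);
                break;
            case COMPLETED:
                scheduler.getTaskUnAssigner().call(e.taskId, e.hostname);
                break;
        }
    }
}
```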

@vditya
Author

vditya commented Oct 26, 2016

Thanks for your reply. What about accounting for previous offers received from Mesos? I would assume that on new leader election you would restore those from the same persistent storage, correct? On a separate topic, are you planning to support the Mesos 1.1 taskGroup, which introduces the pod concept? Thanks

@spodila
Contributor

spodila commented Oct 27, 2016

Upon (re)registration of the Mesos driver, you should invalidate all previous offers in Fenzo by calling TaskScheduler.expireAllLeases().

We don't store offers in the persistent store; I can't think of a reason to do that. For example, upon restart of your framework process, you would connect to Mesos again and start getting new offers. The old offers become invalid when your framework disconnects from Mesos.
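
To make that concrete, a small sketch (the wrapper class and method name here are mine, not Fenzo's; only expireAllLeases() is Fenzo's API):

```java
import com.netflix.fenzo.TaskScheduler;

class OfferHousekeeping {
    private final TaskScheduler taskScheduler;

    OfferHousekeeping(TaskScheduler taskScheduler) {
        this.taskScheduler = taskScheduler;
    }

    // Invoke this from your Mesos Scheduler's registered()/reregistered()
    // callbacks: any offers held before the (re)registration are no longer
    // valid, so drop them all from Fenzo rather than persisting them.
    void onDriverRegistered() {
        taskScheduler.expireAllLeases();
    }
}
```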

Although we don't have a specific timeline to use Mesos 1.1 taskGroup, I'd think that Fenzo is unaffected by it. Fenzo would continue to do allocation by treating the entire group as "one entity" to assign resources to. So, individual tasks of the task group would all be aggregated and provided as one TaskRequest object to Fenzo. Your current implementation of Fenzo's TaskRequest interface will "hide" the individual tasks of the group within it. Let me know if you can think of a reason this won't work for you.
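
As a sketch of that aggregation idea (PodMember and PodTaskRequest are hypothetical types of mine; the real class would implement com.netflix.fenzo.TaskRequest, which has additional methods for constraints, ports, etc. that are not shown here):

```java
import java.util.List;

// Hypothetical pod member descriptor; not a Fenzo or Mesos type.
class PodMember {
    final double cpus;
    final double memoryMB;
    PodMember(double cpus, double memoryMB) {
        this.cpus = cpus;
        this.memoryMB = memoryMB;
    }
}

// Sketch of the aggregation: the real class would implement
// com.netflix.fenzo.TaskRequest and return these summed values from
// getCPUs()/getMemory(), hiding the pod's individual tasks inside it.
class PodTaskRequest {
    private final String id;
    private final List<PodMember> members;

    PodTaskRequest(String id, List<PodMember> members) {
        this.id = id;
        this.members = members;
    }

    public String getId() { return id; }

    public double getCPUs() {
        // Fenzo sees the pod as one entity with the sum of its members' CPUs.
        return members.stream().mapToDouble(m -> m.cpus).sum();
    }

    public double getMemory() {
        return members.stream().mapToDouble(m -> m.memoryMB).sum();
    }
}
```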

@spodila
Contributor

spodila commented Nov 16, 2016

@vditya do you need any further help on this?

@vditya
Author

vditya commented Nov 16, 2016

@spodila Sorry it took me long to reply. Thanks for your help; we at Nvidia are now evaluating Fenzo to solve our scheduler needs. I am sure we will have more questions, feedback, and contributions going forward.

@spodila
Contributor

spodila commented Nov 16, 2016

@vditya No problem, that sounds great. I am glad to provide any additional information or help, and would love to learn more about your use cases and to receive contributions. If it helps, we can schedule a call.
