Clustered execution interval #60

ari · 2018-05-05T12:21:22Z

If I'm reading the code correctly, ZooKeeper is used for locking (preventing two executions at once) but not for synchronising the execution timing.

So if I had a cluster of 5 applications, with a scheduled event once an hour, each application will try to run it once an hour and we'd get executions every 12 minutes on average.
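As a rough check of the 12-minute figure, here is a minimal sketch that merges the timers of several nodes and averages the gaps between consecutive runs. It assumes no run is ever skipped because of the lock; the function name and the start offsets are made up for illustration and are not taken from the project.

```python
def mean_gap_minutes(start_offsets, period=60):
    """Mean gap between consecutive runs when every node fires on its own timer.

    start_offsets: minute within the period at which each node's timer fires
    (hypothetical values; in practice they depend on when each app started).
    """
    times = sorted(offset % period for offset in start_offsets)
    # Gaps between neighbouring runs inside one period...
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    # ...plus the wrap-around gap from the last run to the first run of the next period.
    gaps.append(period - times[-1] + times[0])
    return sum(gaps) / len(gaps)


print(mean_gap_minutes([0, 7, 20, 33, 51]))  # 12.0
```

The individual gaps vary with the start offsets, but they always sum to one period, so five nodes on an hourly timer average out at 12 minutes.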

A possible solution might be to store the last run timestamp in the ZK node, and persist that node between executions.

andrus · 2018-05-05T12:29:04Z

You are right that ZK is used here for locking only, not for centralized scheduling. Unless the clocks are out of sync between the cluster nodes or the jobs finish really quickly, "losing" nodes would simply abandon the run until the next scheduled event if they fail to obtain a lock within a short timeout. So in practice the job will still run once an hour.
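The lock-with-timeout behaviour described above can be sketched roughly as follows. This is an illustration only: an in-process `threading.Lock` stands in for the shared ZooKeeper lock, threads stand in for cluster nodes, and `run_cluster`, the job duration and the timeout value are all assumptions rather than anything from the project.

```python
import threading
import time


def run_cluster(nodes=5, job_seconds=0.5, lock_timeout=0.05):
    """Fire the same scheduled event on every node and return the nodes that ran the job."""
    lock = threading.Lock()              # stand-in for the shared ZooKeeper lock
    barrier = threading.Barrier(nodes)   # models perfectly synchronised clocks
    executed = []

    def node(name):
        barrier.wait()                   # all timers fire at the same instant
        if not lock.acquire(timeout=lock_timeout):
            return                       # lost the race: skip until the next scheduled event
        try:
            executed.append(name)
            time.sleep(job_seconds)      # the job itself, holding the lock while it runs
        finally:
            lock.release()

    threads = [threading.Thread(target=node, args=(f"node-{i}",)) for i in range(nodes)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return executed


print(len(run_cluster()))  # 1
```

If `job_seconds` is set shorter than `lock_timeout`, the waiting nodes can still obtain the lock after the winner releases it, which is the "jobs finish really quickly" case where more than one run slips through.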

Having said that, the current situation is not ideal and we are looking at alt architectures, one being a single centralized scheduler and job dispatching done via an event queue to the clustered "agent" nodes.

ari · 2018-05-05T12:43:48Z

If the jobs are on a fixedDelay rather than cron, then their run times will depend on when each app was started. But yes, a cron approach with good clock sync should be better in this case. I hadn't realised till now that this project supported that too.

A centralised scheduler adds a new single point of failure though. So using ZooKeeper as a single shared lock and timestamp might be a simpler solution, no?

andrus · 2018-05-05T12:56:14Z

Yeah, ZK can be used either for leader election for the scheduler (to avoid a single point of failure) or as an execution tracking mechanism (your timestamp suggestion ... I guess it will require a bit of fuzziness when a job decides whether to run or not).
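A minimal sketch of the timestamp idea with some fuzziness might look like this. Everything here is hypothetical: `should_run`, `simulate`, the 30-second tolerance and the start offsets are assumptions for illustration, and a plain variable stands in for the value that would be stored in the ZK node. In a real setup the read-check-write would need to happen while holding the lock.

```python
def should_run(last_run, now, interval, tolerance):
    """Decide whether a node whose timer just fired should start the job.

    last_run:  epoch seconds read from the shared node, or None if the job never ran.
    tolerance: slack in seconds that absorbs clock drift and scheduling jitter.
    """
    if last_run is None:
        return True
    return now - last_run >= interval - tolerance


def simulate(start_offsets, interval, tolerance, horizon):
    """Replay every node's timer against one shared timestamp; return the actual run times."""
    last_run = None  # stand-in for the timestamp persisted in the shared node
    runs = []
    ticks = sorted(offset + k * interval
                   for offset in start_offsets
                   for k in range(horizon // interval))
    for now in ticks:
        if should_run(last_run, now, interval, tolerance):
            last_run = now  # would be written back to the shared node under the lock
            runs.append(now)
    return runs


# Five nodes on an hourly fixed delay, started at different moments (seconds).
print(simulate([0, 420, 1200, 1980, 3060], interval=3600, tolerance=30, horizon=3 * 3600))
# [0, 3600, 7200]
```

Without the tolerance, a node firing a few seconds early because of jitter would see slightly less than a full interval elapsed and skip a run that should have happened; too large a tolerance lets a second node run shortly before the interval is really up.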
