Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dedup ScheduleStream by jobName. #689

Merged
merged 1 commit into from
Jun 21, 2016
Merged

Conversation

debugger87
Copy link
Contributor

@debugger87 debugger87 commented Jun 13, 2016

chronos with only one server instance will be switched to Defeated status if it's disconnected to zookeeper and be switched to Elected if it's connected to zookeeper. In this case, chronos load job from zookeeper and add ScheduleStream into streams in JobScheduler. If we don't dedup ScheduleStream by jobName, the size of streams will increase unexpetedly as follows:

[INFO] [pool-5-thread-1] 06-10 00:01:00,468 [JobScheduler] - Size of streams: 150
......
[INFO] [pool-5-thread-1] 06-10 19:01:46,113 [JobScheduler] - Defeated. Not the current leader.
[INFO] [main-EventThread] 06-10 19:01:59,900 [ConnectionStateManager] - State change: RECONNECTED
[INFO] [pool-5-thread-1] 06-10 19:01:59,905 [JobScheduler] - Elected as leader.
....
[INFO] [pool-5-thread-1] 06-10 19:02:35,965 [JobScheduler] - Size of streams: 231

Obviously, there are many ScheduleStreams which are duplicated in streams. If some jobs' tasks fail, chronos' JobScheduler may run into unexpected status like dead lock.

@debugger87
Copy link
Contributor Author

debugger87 commented Jun 14, 2016

@brndnmtthws could you help me to review this PR?

@brndnmtthws
Copy link
Member

Interesting. Thanks for the PR. Can you describe how you tested it?

@debugger87
Copy link
Contributor Author

debugger87 commented Jun 20, 2016

@brndnmtthws We can just launch only one server instance of chronos, shut down and restart zookeeper in some time (make chronos disconnect to zookeeper and re-elected as leader). In this case, streams in JobScheduler will append duplicated ScheduleStreams.

@brndnmtthws brndnmtthws merged commit ce4469d into mesos:master Jun 21, 2016
gkleiman pushed a commit that referenced this pull request Apr 26, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants