orchestrator: Flag tasks that shouldn't be restarted #2327

aaronlehmann · 2017-07-21T22:51:29Z

Previously, restart conditions other than "OnAny" were honored on a
best-effort basis. A service-level reconciliation, for example after a
leader election, would see that not enough tasks were running, and start
replacement tasks regardless of the restart policy. This limited the
usefulness of the other restart conditions.

This change adds a DontRestart flag to Task. It can be set by the
restart supervisor when it shuts down a task and decides not to start a
replacement task. The orchestrators look for the presence of this flag
and honor it when doing service-level reconciliation. If the flag is
set, the dead task is passed to the updater along with the running
tasks, so the updater can start a replacement if and only if the service
definition has changed relative to the dead task.
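
The reconciliation rule described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the `Task` struct, the `SpecVersion` field, the `Action` values, and `reconcileSlot` are hypothetical names, and only the `DontRestart` flag comes from the description.

```go
package main

import "fmt"

// Task is a simplified, hypothetical stand-in for the orchestrator's task
// object. Only DontRestart is named in the change description.
type Task struct {
	ID          string
	Running     bool
	SpecVersion int  // hypothetical: version of the service spec the task was created from
	DontRestart bool // set by the restart supervisor when it declines to replace the task
}

// Action is what reconciliation decides to do with a single slot.
type Action string

const (
	KeepRunning      Action = "keep-running"
	LeaveStopped     Action = "leave-stopped"
	StartReplacement Action = "start-replacement"
)

// reconcileSlot decides what to do with a slot, given the most recent task
// in that slot (nil if the slot is empty) and the current service spec version.
func reconcileSlot(latest *Task, serviceSpecVersion int) Action {
	switch {
	case latest == nil:
		// Empty slot: nothing to honor, so start a task.
		return StartReplacement
	case latest.Running || latest.DontRestart:
		// Running tasks and flagged dead tasks are both handed to the
		// updater, which replaces them only if the service definition changed.
		if latest.SpecVersion != serviceSpecVersion {
			return StartReplacement
		}
		if latest.Running {
			return KeepRunning
		}
		return LeaveStopped
	default:
		// Dead task without the flag: reconciliation replaces it as before.
		return StartReplacement
	}
}

func main() {
	dead := &Task{ID: "t1", SpecVersion: 1, DontRestart: true}
	fmt.Println(reconcileSlot(dead, 1)) // service unchanged since the task died
	fmt.Println(reconcileSlot(dead, 2)) // service updated since the task died
}
```

The point of the sketch is that a flagged dead task is treated like a running task for update purposes, which is why a leader election alone would no longer trigger a restart.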

The task reaper has been modified so that it never deletes the last task
in a slot if that task has the DontRestart flag set.
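
A minimal sketch of that reaper rule, under simplifying assumptions: the `Task` struct and `reapSlot` function are hypothetical, the slot history is assumed to be ordered oldest first and to contain only dead tasks, and `limit` stands in for the task history retention limit.

```go
package main

import "fmt"

// Task is a simplified, hypothetical stand-in for a dead task in a slot.
type Task struct {
	ID          string
	DontRestart bool
}

// reapSlot returns the IDs of tasks to delete from one slot's history so
// that at most limit tasks remain. history is ordered oldest first.
// The newest task is never deleted if it carries the DontRestart flag,
// even when that leaves the slot over the limit.
func reapSlot(history []Task, limit int) []string {
	excess := len(history) - limit
	if excess <= 0 {
		return nil
	}
	var deleted []string
	for i, t := range history {
		if len(deleted) == excess {
			break
		}
		isLast := i == len(history)-1
		if isLast && t.DontRestart {
			// Deleting this task would erase the record that the slot
			// must not be restarted, so keep it.
			continue
		}
		deleted = append(deleted, t.ID)
	}
	return deleted
}

func main() {
	history := []Task{{ID: "a"}, {ID: "b"}, {ID: "c", DontRestart: true}}
	fmt.Println(reapSlot(history, 0))
}
```

With a limit of 0, the flagged newest task is the only one that survives, which is the case the review discussion below refers to as best-effort history limiting.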

Fixes #932
Counterproposal to #2290, informed by corner cases encountered there

cc @cyli @aluzzardi

codecov · 2017-07-21T23:02:27Z

Codecov Report

Merging #2327 into master will decrease coverage by 0.06%.
The diff coverage is 9.43%.

@@            Coverage Diff             @@
##           master    #2327      +/-   ##
==========================================
- Coverage   60.31%   60.24%   -0.07%     
==========================================
  Files         128      128              
  Lines       26002    26032      +30     
==========================================
  Hits        15683    15683              
- Misses       8929     8959      +30     
  Partials     1390     1390

diogomonica · 2017-07-24T17:58:19Z

At a high level, this seems like a decent solution. Having followed #2290, I don't see a better alternative, but I'm also no longer super familiar with this portion of the codebase.

cyli · 2017-07-24T19:20:41Z

manager/orchestrator/update/updater.go

 		}

-		service.UpdateStatus.State = api.UpdateStatus_ROLLBACK_STARTED
-		service.UpdateStatus.Message = message
+		err = batch.Update(func(tx store.Tx) error {


Non-blocking nitpick: maybe just return batch.Update here?

cyli

As someone not very familiar with the intricacies of the orchestrator, reaper, updater, etc., this solution makes more sense to me and is easier to understand.

It seems fine to me to rely on the task history, and for --task-history-limit 0 to be a best-effort sort of thing. Maybe, if that's specified, we can just hide the task history in the API but actually still keep it around?

Signed-off-by: Aaron Lehmann <[email protected]>

aaronlehmann · 2017-07-27T20:59:30Z

Superseded by #2332

cyli reviewed Jul 24, 2017

cyli approved these changes Jul 24, 2017

aaronlehmann force-pushed the flag-no-restart branch from 2d279a2 to 43bbe90 Compare July 24, 2017 21:06

This was referenced Jul 24, 2017:

[WIP] Respect restart policy during service reconciliation #2290 (Closed)

orchestrator: Use task history to evaluate restart policy #2332 (Merged)

aaronlehmann closed this Jul 27, 2017
