Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

apps: Fix dc triggers reconciliation on image change and do not deploy DCs with empty image #17539

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
113 changes: 71 additions & 42 deletions pkg/apps/controller/deploymentconfig/deploymentconfig_controller.go
Original file line number Diff line number Diff line change
Expand Up @@ -87,7 +87,7 @@ func (c *DeploymentConfigController) Handle(config *appsapi.DeploymentConfig) er
glog.V(5).Infof("Reconciling %s/%s", config.Namespace, config.Name)
// There's nothing to reconcile until the version is nonzero.
if appsutil.IsInitialDeployment(config) && !appsutil.HasTrigger(config) {
return c.updateStatus(config, []*v1.ReplicationController{})
return c.updateStatus(config, []*v1.ReplicationController{}, true)
}

// List all ReplicationControllers to find also those we own but that no longer match our selector.
Expand Down Expand Up @@ -118,7 +118,18 @@ func (c *DeploymentConfigController) Handle(config *appsapi.DeploymentConfig) er
// the latest available information. Some deletions make take some time to complete so there
// is value in doing this.
if config.DeletionTimestamp != nil {
return c.updateStatus(config, existingDeployments)
return c.updateStatus(config, existingDeployments, true)
}

// If the config is paused we shouldn't create new deployments for it.
if config.Spec.Paused {
// in order for revision history limit cleanup to work for paused
// deployments, we need to trigger it here
if err := c.cleanupOldDeployments(existingDeployments, config); err != nil {
c.recorder.Eventf(config, v1.EventTypeWarning, "DeploymentCleanupFailed", "Couldn't clean up deployments: %v", err)
}

return c.updateStatus(config, existingDeployments, true)
}

latestExists, latestDeployment := appsutil.LatestDeploymentInfo(config, existingDeployments)
Expand All @@ -128,38 +139,52 @@ func (c *DeploymentConfigController) Handle(config *appsapi.DeploymentConfig) er
return err
}
}

// Never deploy with invalid or unresolved images
for i, container := range config.Spec.Template.Spec.Containers {
if len(strings.TrimSpace(container.Image)) == 0 {
glog.V(4).Infof("Postponing rollout #%d for DeploymentConfig %s/%s because of invalid or unresolved image for container #%d (name=%s)", config.Status.LatestVersion, config.Namespace, config.Name, i, container.Name)
return c.updateStatus(config, existingDeployments, true)
}
}

configCopy := config.DeepCopy()
// Process triggers and start an initial rollouts
shouldTrigger, shouldSkip := triggerActivated(configCopy, latestExists, latestDeployment, c.codec)
if !shouldSkip && shouldTrigger {
configCopy.Status.LatestVersion++
return c.updateStatus(configCopy, existingDeployments)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mfojtik I wonder how this ever worked

if reflect.DeepEqual(newStatus, config.Status) {
return nil
}

it doesn't see latestVersion change (made prior to calculating status) and if there is no other change the update gets ignored

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

¯_(ツ)_/¯

shouldTrigger, shouldSkip, err := triggerActivated(configCopy, latestExists, latestDeployment, c.codec)
if err != nil {
return fmt.Errorf("triggerActivated failed: %v", err)
}
// Have to wait for the image trigger to get the image before proceeding.
if shouldSkip && appsutil.IsInitialDeployment(config) {
return c.updateStatus(configCopy, existingDeployments)

if shouldSkip {
return c.updateStatus(configCopy, existingDeployments, true)
}

if shouldTrigger {
configCopy.Status.LatestVersion++
_, err := c.dn.DeploymentConfigs(configCopy.Namespace).UpdateStatus(configCopy)
return err
}

// If the latest deployment already exists, reconcile existing deployments
// and return early.
if latestExists {
// If the latest deployment is still running, try again later. We don't
// want to compete with the deployer.
if !appsutil.IsTerminatedDeployment(latestDeployment) {
return c.updateStatus(config, existingDeployments)
return c.updateStatus(config, existingDeployments, false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this the only case when we don't update the observed generation? maybe it will be nicer to have two funcs: updateStatusAndGeneration and updateStatusWithoutGeneration... (i really don't like the boolean params as one have to always search what are they for ;-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's worth it to split 1 func into 3 (1 internal) just because of the bool; I don't feel strong about it though

}

return c.reconcileDeployments(existingDeployments, config, cm)
}
// If the config is paused we shouldn't create new deployments for it.
if config.Spec.Paused {
// in order for revision history limit cleanup to work for paused
// deployments, we need to trigger it here
if err := c.cleanupOldDeployments(existingDeployments, config); err != nil {
c.recorder.Eventf(config, v1.EventTypeWarning, "DeploymentCleanupFailed", "Couldn't clean up deployments: %v", err)
}

return c.updateStatus(config, existingDeployments)
// Never deploy with invalid or unresolved images
for i, container := range config.Spec.Template.Spec.Containers {
if len(strings.TrimSpace(container.Image)) == 0 {
glog.V(4).Infof("Postponing rollout #%d for DeploymentConfig %s/%s because of invalid or unresolved image for container #%d (name=%s)", config.Status.LatestVersion, config.Namespace, config.Name, i, container.Name)
return c.updateStatus(config, existingDeployments, true)
}
}

// No deployments are running and the latest deployment doesn't exist, so
// create the new deployment.
deployment, err := appsutil.MakeDeploymentV1(config, c.codec)
Expand All @@ -182,17 +207,17 @@ func (c *DeploymentConfigController) Handle(config *appsapi.DeploymentConfig) er
if isOurs {
// If the deployment was already created, just move on. The cache could be
// stale, or another process could have already handled this update.
return c.updateStatus(config, existingDeployments)
return c.updateStatus(config, existingDeployments, true)
} else {
err = fmt.Errorf("replication controller %s already exists and deployment config is not allowed to claim it.", deployment.Name)
err = fmt.Errorf("replication controller %s already exists and deployment config is not allowed to claim it", deployment.Name)
c.recorder.Eventf(config, v1.EventTypeWarning, "DeploymentCreationFailed", "Couldn't deploy version %d: %v", config.Status.LatestVersion, err)
return c.updateStatus(config, existingDeployments)
return c.updateStatus(config, existingDeployments, true)
}
}
c.recorder.Eventf(config, v1.EventTypeWarning, "DeploymentCreationFailed", "Couldn't deploy version %d: %s", config.Status.LatestVersion, err)
// We don't care about this error since we need to report the create failure.
cond := appsutil.NewDeploymentCondition(appsapi.DeploymentProgressing, kapi.ConditionFalse, appsapi.FailedRcCreateReason, err.Error())
_ = c.updateStatus(config, existingDeployments, *cond)
_ = c.updateStatus(config, existingDeployments, true, *cond)
return fmt.Errorf("couldn't create deployment for deployment config %s: %v", appsutil.LabelForDeploymentConfig(config), err)
}
msg := fmt.Sprintf("Created new replication controller %q for version %d", created.Name, config.Status.LatestVersion)
Expand All @@ -206,7 +231,7 @@ func (c *DeploymentConfigController) Handle(config *appsapi.DeploymentConfig) er
}

cond := appsutil.NewDeploymentCondition(appsapi.DeploymentProgressing, kapi.ConditionTrue, appsapi.NewReplicationControllerReason, msg)
return c.updateStatus(config, existingDeployments, *cond)
return c.updateStatus(config, existingDeployments, true, *cond)
}

// reconcileDeployments reconciles existing deployment replica counts which
Expand Down Expand Up @@ -285,13 +310,13 @@ func (c *DeploymentConfigController) reconcileDeployments(existingDeployments []
c.recorder.Eventf(config, v1.EventTypeWarning, "ReplicationControllerCleanupFailed", "Couldn't clean up replication controllers: %v", err)
}

return c.updateStatus(config, updatedDeployments)
return c.updateStatus(config, updatedDeployments, true)
}

// Update the status of the provided deployment config. Additional conditions will override any other condition in the
// deployment config status.
func (c *DeploymentConfigController) updateStatus(config *appsapi.DeploymentConfig, deployments []*v1.ReplicationController, additional ...appsapi.DeploymentCondition) error {
newStatus := calculateStatus(config, deployments, additional...)
func (c *DeploymentConfigController) updateStatus(config *appsapi.DeploymentConfig, deployments []*v1.ReplicationController, updateObservedGeneration bool, additional ...appsapi.DeploymentCondition) error {
newStatus := calculateStatus(config, deployments, updateObservedGeneration, additional...)

// NOTE: We should update the status of the deployment config only if we need to, otherwise
// we hotloop between updates.
Expand Down Expand Up @@ -376,7 +401,7 @@ func (c *DeploymentConfigController) cancelRunningRollouts(config *appsapi.Deplo
return nil
}

func calculateStatus(config *appsapi.DeploymentConfig, rcs []*v1.ReplicationController, additional ...appsapi.DeploymentCondition) appsapi.DeploymentConfigStatus {
func calculateStatus(config *appsapi.DeploymentConfig, rcs []*v1.ReplicationController, updateObservedGeneration bool, additional ...appsapi.DeploymentCondition) appsapi.DeploymentConfigStatus {
// UpdatedReplicas represents the replicas that use the current deployment config template which means
// we should inform about the replicas of the latest deployment and not the active.
latestReplicas := int32(0)
Expand All @@ -394,10 +419,15 @@ func calculateStatus(config *appsapi.DeploymentConfig, rcs []*v1.ReplicationCont
unavailableReplicas = 0
}

generation := config.Status.ObservedGeneration
if updateObservedGeneration {
generation = config.Generation
}

status := appsapi.DeploymentConfigStatus{
LatestVersion: config.Status.LatestVersion,
Details: config.Status.Details,
ObservedGeneration: config.Generation,
ObservedGeneration: generation,
Replicas: appsutil.GetStatusReplicaCountForDeployments(rcs),
UpdatedReplicas: latestReplicas,
AvailableReplicas: available,
Expand Down Expand Up @@ -519,17 +549,17 @@ func (c *DeploymentConfigController) cleanupOldDeployments(existingDeployments [
// triggers were activated (config change or image change). The first bool indicates that
// the triggers are active and second indicates if we should skip the rollout because we
// are waiting for the trigger to complete update (waiting for image for example).
func triggerActivated(config *appsapi.DeploymentConfig, latestExists bool, latestDeployment *v1.ReplicationController, codec runtime.Codec) (bool, bool) {
func triggerActivated(config *appsapi.DeploymentConfig, latestExists bool, latestDeployment *v1.ReplicationController, codec runtime.Codec) (bool, bool, error) {
if config.Spec.Paused {
return false, false
return false, false, nil
}
imageTrigger := appsutil.HasImageChangeTrigger(config)
configTrigger := appsutil.HasChangeTrigger(config)
hasTrigger := imageTrigger || configTrigger

// no-op when no triggers are defined.
if !hasTrigger {
return false, false
return false, false, nil
}

// Handle initial rollouts
Expand All @@ -542,50 +572,49 @@ func triggerActivated(config *appsapi.DeploymentConfig, latestExists bool, lates
// TODO: Technically this is not a config change cause, but we will have to report the image that caused the trigger.
// In some cases it might be difficult because config can have multiple ICT.
appsutil.RecordConfigChangeCause(config)
return true, false
return true, false, nil
}
glog.V(4).Infof("Rolling out initial deployment for %s/%s deferred until its images are ready", config.Namespace, config.Name)
return false, true
return false, true, nil
}
// Rollout if we only have config change trigger.
if configTrigger {
glog.V(4).Infof("Rolling out initial deployment for %s/%s", config.Namespace, config.Name)
appsutil.RecordConfigChangeCause(config)
return true, false
return true, false, nil
}
// We are waiting for the initial RC to be created.
return false, false
return false, false, nil
}

// Wait for the RC to be created
if !latestExists {
return false, true
return false, false, nil
}

// We need existing deployment at this point to compare its template with current config template.
if latestDeployment == nil {
return false, false
return false, false, nil
}

if imageTrigger {
if ok, imageNames := appsutil.HasUpdatedImages(config, latestDeployment); ok {
glog.V(4).Infof("Rolling out #%d deployment for %s/%s caused by image changes (%s)", config.Status.LatestVersion+1, config.Namespace, config.Name, strings.Join(imageNames, ","))
appsutil.RecordImageChangeCauses(config, imageNames)
return true, false
return true, false, nil
}
}

if configTrigger {
isLatest, changes, err := appsutil.HasLatestPodTemplate(config, latestDeployment, codec)
if err != nil {
glog.Errorf("Error while checking for latest pod template in replication controller: %v", err)
return false, true
return false, false, fmt.Errorf("error while checking for latest pod template in replication controller: %v", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why returning this error? isn't logging it enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you return error it will get rate limited if it's repeating and it should not update status as it failed to act on it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

appsutil.HasLatestPodTemplate can only fail when decoding deployment config, I guess it won't even get this far if decoding fail...

Maybe we can convert the second bool to an error and then check that error time (ErrSkipTrigger) or something?

Copy link
Contributor Author

@tnozicka tnozicka Jan 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

appsutil.HasLatestPodTemplate can only fail when decoding deployment config, I guess it won't even get this far if decoding fail...

which is likely a permanent failure which the ratelimiting queue will shield us from

Maybe we can convert the second bool to an error and then check that error time (ErrSkipTrigger) or something?

seems too obscure; skipping the trigger is not an error

}
if !isLatest {
glog.V(4).Infof("Rolling out #%d deployment for %s/%s caused by config change, diff: %s", config.Status.LatestVersion+1, config.Namespace, config.Name, changes)
appsutil.RecordConfigChangeCause(config)
return true, false
return true, false, nil
}
}
return false, false
return false, false, nil
}
Original file line number Diff line number Diff line change
Expand Up @@ -452,6 +452,9 @@ func newInt32(i int32) *int32 {

func newDC(version, replicas, maxUnavailable int, cond appsapi.DeploymentCondition) *appsapi.DeploymentConfig {
return &appsapi.DeploymentConfig{
ObjectMeta: metav1.ObjectMeta{
Generation: 1,
},
Spec: appsapi.DeploymentConfigSpec{
Replicas: int32(replicas),
Strategy: appsapi.DeploymentStrategy{
Expand Down Expand Up @@ -502,8 +505,9 @@ func TestCalculateStatus(t *testing.T) {
tests := []struct {
name string

dc *appsapi.DeploymentConfig
rcs []*v1.ReplicationController
dc *appsapi.DeploymentConfig
rcs []*v1.ReplicationController
updateObservedGeneration bool

expected appsapi.DeploymentConfigStatus
}{
Expand All @@ -528,6 +532,29 @@ func TestCalculateStatus(t *testing.T) {
},
},
},
{
name: "available deployment with updating observedGeneration",

dc: newDC(3, 3, 1, availableCond),
rcs: []*v1.ReplicationController{
newRC(3, 2, 2, 1, 1),
newRC(2, 0, 0, 0, 0),
newRC(1, 0, 1, 1, 1),
},
updateObservedGeneration: true,

expected: appsapi.DeploymentConfigStatus{
LatestVersion: int64(3),
ObservedGeneration: int64(1),
Replicas: int32(3),
ReadyReplicas: int32(2),
AvailableReplicas: int32(2),
UpdatedReplicas: int32(2),
Conditions: []appsapi.DeploymentCondition{
availableCond,
},
},
},
{
name: "unavailable deployment",

Expand All @@ -552,7 +579,7 @@ func TestCalculateStatus(t *testing.T) {
}

for _, test := range tests {
status := calculateStatus(test.dc, test.rcs)
status := calculateStatus(test.dc, test.rcs, test.updateObservedGeneration)
if !reflect.DeepEqual(status, test.expected) {
t.Errorf("%s: expected status:\n%+v\ngot status:\n%+v", test.name, test.expected, status)
}
Expand Down
11 changes: 7 additions & 4 deletions pkg/image/trigger/deploymentconfigs/deploymentconfigs.go
Original file line number Diff line number Diff line change
Expand Up @@ -128,13 +128,19 @@ func (i deploymentConfigTriggerIndexer) Index(obj, old interface{}) (string, *tr
default:
// updated
dc = obj.(*appsapi.DeploymentConfig)
oldDC := old.(*appsapi.DeploymentConfig)
triggers = calculateDeploymentConfigTriggers(dc)
oldTriggers := calculateDeploymentConfigTriggers(old.(*appsapi.DeploymentConfig))
oldTriggers := calculateDeploymentConfigTriggers(oldDC)
switch {
case len(oldTriggers) == 0:
change = cache.Added
case !reflect.DeepEqual(oldTriggers, triggers):
change = cache.Updated
// We need to react on image changes as well. Image names could change,
// images could be set to different value or resetted to "" e.g. by oc apply
// and we need to make sure those changes get reconciled by re-resolving images
case !reflect.DeepEqual(dc.Spec.Template.Spec.Containers, oldDC.Spec.Template.Spec.Containers):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be check only the images then?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

worst case this will be processed but not triggered, but I think we need some common helper like HasUpdatedImages() to work well for this and for other places. this is fine now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

discussed on IRC but for the record: I don't see other way we can determine if it needs to be queued without actually doing all the work in UpdateDeploymentConfigImages. This is just a check for adding to work queue, it should be fast and simple. After picking it up from queue we already do all the checks for updated images and others.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when I want to run my own image for sometime and not the image resolved by the trigger? Can I do that now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Kargakis set automatic to false
if it's active the state should always be reconciled

change = cache.Updated
}
}

Expand Down Expand Up @@ -187,9 +193,6 @@ func UpdateDeploymentConfigImages(dc *appsapi.DeploymentConfig, tagRetriever tri
glog.V(4).Infof("trigger %#v in deployment %s is not resolveable", p, dc.Name)
return nil, false, nil
}
if ref == p.LastTriggeredImage {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this going away?

Copy link
Contributor Author

@tnozicka tnozicka Jan 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because it would stop DC...image from being reconciled. If someone overwrites it manually, this will prevent reconciliation which should have been done

continue
}

if len(ref) == 0 {
ref = p.LastTriggeredImage
Expand Down
Loading