NiFi upgrade doesn't work #238

soenkeliebau · 2022-03-17T14:33:11Z

Affected version

0.5.0

Current and expected behavior

Scenario
A NifiCluster with three nodes was deployed with version 1.13.2 and is up and running.

The NifiCluster custom resource is then changed to version 1.15.0.

Current Behavior
The StatefulSet is updated with the new image and triggers a rolling restart of the NiFi Pods with the new container image set to NiFi 1.15.0.

However, NiFi does not support running a cluster with mixed versions; a full stop and restart with the new version is required instead.

Reference ticket:
https://issues.apache.org/jira/browse/NIFI-4068?jql=project%20%3D%20NIFI%20AND%20text%20~%20%22rolling%20upgrade%22

Because of this, the new pod never starts successfully and the restart hangs indefinitely, or until the user deletes all pods and the StatefulSet recreates them with the same version.
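The hang described above can be illustrated with a toy model. This is only a sketch: `rolling_update` is a hypothetical helper, and the readiness rule (a pod becomes ready only when every node runs the same version) is a simplifying assumption standing in for NiFi's actual cluster-join behaviour. The reverse-ordinal, one-pod-at-a-time order mirrors how a StatefulSet performs a rolling update.

```python
from typing import List, Tuple


def rolling_update(pods: List[str], new_version: str) -> Tuple[List[str], bool]:
    """Toy model of a StatefulSet rolling update.

    Returns the resulting pod versions and whether the rollout finished.
    """
    pods = list(pods)
    # StatefulSets replace pods one at a time, highest ordinal first.
    for ordinal in reversed(range(len(pods))):
        pods[ordinal] = new_version
        # Assumption of this model: a pod is only ready once the whole
        # cluster runs a single version.
        ready = len(set(pods)) == 1
        if not ready:
            # The controller waits for readiness before touching the next
            # pod, so the rollout stalls here.
            return pods, False
    return pods, True


versions, finished = rolling_update(["1.13.2", "1.13.2", "1.13.2"], "1.15.0")
print(versions, finished)
```

In this model the three-node rollout stalls after the first pod is replaced, leaving two old nodes and one new node that never becomes ready.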

Expected Behavior
The operator should notice that a version change is happening and trigger a full restart of NiFi.

This is done when

I can update my NiFi CR from one NiFi version to another and have the operator automatically run a full restart of NiFi, resulting in a fully running NiFi on the new version
The solution to implement this can be reused by other operators for similar scenarios
The proposed solution has been discussed in the architecture meeting before it's fully implemented
The implemented solution has been documented in the Contributor's Guide

Possible solution

The operator needs to be able to recognize a version change during reconciliation and then act accordingly to perform a full cluster restart.

Something along the lines of this code (suggested by @teozkr ) might work:

if current_sts.spec.template.spec.image != new_image:
  if current_sts.status.replicas > 0:
    # Wait for all current replicas to die
    new_sts.spec.replicas = 0
  else:
    # All old replicas are dead, do the upgrade
    new_sts.spec.replicas = rolegroup.replicas
    new_sts.spec.template.spec.image = new_image
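
The pseudocode above could be fleshed out roughly as follows. This is a minimal sketch and not the operator's actual implementation: `StsState` and `plan_upgrade` are hypothetical names, the image strings are made up for illustration, and the first branch (no version change) is an addition that the pseudocode does not spell out.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StsState:
    """Hypothetical, simplified view of a StatefulSet."""
    image: str            # container image in the pod template
    spec_replicas: int    # desired replica count
    status_replicas: int  # pods that currently exist


def plan_upgrade(current: StsState, new_image: str, desired_replicas: int) -> StsState:
    """Return the StatefulSet state to apply in this reconcile pass."""
    if current.image == new_image:
        # No version change: only make sure the replica count is right.
        return replace(current, spec_replicas=desired_replicas)
    if current.status_replicas > 0:
        # Old pods still exist: scale to zero and keep the old image so
        # that no new-version pod is started next to old ones.
        return replace(current, spec_replicas=0)
    # All old replicas are gone: switch the image and scale back up.
    return replace(current, image=new_image, spec_replicas=desired_replicas)


running = StsState(image="nifi:1.13.2", spec_replicas=3, status_replicas=3)
print(plan_upgrade(running, "nifi:1.15.0", 3))
```

Called once per reconciliation, this first scales the old cluster down, then, once `status_replicas` reaches zero, brings it back up on the new image.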

Environment

This should be reproducible independently of the K8s environment.


lfrancke · 2022-09-28T11:15:27Z

None of the boxes have been checked.
Do they still make sense and if so could you check what's been done?

@razvan @soenkeliebau

soenkeliebau · 2022-10-04T11:38:37Z

I'm afraid the boxes are not really checkable at the moment, as we did not implement a generic solution for this. Not sure if I wrote those checkboxes back in the day, but I'd say we don't need something abstract yet, as this currently only affects NiFi.
Opinions?

lfrancke · 2022-10-04T11:42:24Z

No, I'm fine with that and I'm fine with not checking the boxes.
Do we need a follow-up ticket then or can we create one if something comes up?

soenkeliebau · 2022-10-04T13:58:43Z

We can create that as and when needed, I think.

soenkeliebau added the type/bug label Mar 17, 2022

soenkeliebau mentioned this issue Mar 17, 2022

Nifi monitoring script stackabletech/docker-images#66

Merged

soenkeliebau added a commit that referenced this issue Aug 11, 2022

Added code to perform full cluster stop when changing the deployed NiFi version. fixes #238 Signed-off-by: Sönke Liebau <[email protected]>

566ea3e

soenkeliebau added a commit that referenced this issue Aug 11, 2022

Added code to perform full cluster stop when changing the deployed NiFi version. fixes #238 Signed-off-by: Sönke Liebau <[email protected]>

1c8d33e

soenkeliebau mentioned this issue Aug 11, 2022

[Merged by Bors] - Added code to perform full cluster stop when changing the deployed NiFi version #323

Closed

7 tasks

lfrancke moved this to Development: Waiting for Review in Stackable Engineering Aug 12, 2022

lfrancke added this to Stackable Engineering Aug 12, 2022

lfrancke moved this from Development: Waiting for Review to Development: In Review in Stackable Engineering Aug 12, 2022

lfrancke removed this from Stackable Engineering Aug 23, 2022

lfrancke moved this to Development: Track in Stackable Engineering Aug 23, 2022

lfrancke added this to Stackable Engineering Aug 23, 2022

lfrancke assigned soenkeliebau Aug 31, 2022

sbernauer moved this from Development: Track to Development: In Progress in Stackable Engineering Sep 5, 2022

soenkeliebau assigned razvan Sep 19, 2022

lfrancke moved this from Development: In Progress to Development: Waiting for Review in Stackable Engineering Sep 20, 2022

lfrancke moved this from Development: Waiting for Review to Development: In Review in Stackable Engineering Sep 20, 2022

bors bot closed this as completed in 6bc536e Sep 23, 2022

razvan moved this from Development: In Review to Development: Done in Stackable Engineering Sep 23, 2022

lfrancke moved this from Development: Done to Acceptance: Waiting for in Stackable Engineering Sep 26, 2022

lfrancke moved this from Acceptance: Waiting for to Acceptance: In Progress in Stackable Engineering Sep 28, 2022

lfrancke self-assigned this Oct 4, 2022

lfrancke moved this from Acceptance: In Progress to Done in Stackable Engineering Oct 4, 2022
