Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Merged by Bors] - Added code to perform full cluster stop when changing the deployed NiFi version #323

Closed
wants to merge 18 commits into from

Conversation

soenkeliebau
Copy link
Member

Currenty we cannot upgrade NiFi without manual intervention, as NiFi can't run with mixed versions in a cluster and StateFulSets do not have an upgrade policy of "full".

This PR adds tracking of the deployed NiFi version to the status of the NiFi object and evaluates if a reconciliation signifies a change of the deployed version. If this is recognized a full stop of the cluster is performed and only then the updated version rolled out.

fixes #238

Signed-off-by: Sönke Liebau [email protected]

Description

Please add a description here. This will become the commit message of the merge request later.

Review Checklist

  • Code contains useful comments
  • CRD change approved (or not applicable)
  • (Integration-)Test cases added (or not applicable)
  • Documentation added (or not applicable)
  • Changelog updated (or not applicable)
  • Cargo.toml only contains references to git tags (not specific commits or branches)
  • Helm chart can be installed and deployed operator works (or not applicable)

Once the review is done, comment bors r+ (or bors merge) to merge. Further information

@soenkeliebau soenkeliebau requested a review from a team August 11, 2022 15:37
Signed-off-by: Sönke Liebau <[email protected]>
This reverts commit 1d0e0a5.
Signed-off-by: Sönke Liebau <[email protected]>
Signed-off-by: Sönke Liebau <[email protected]>
@maltesander maltesander self-assigned this Aug 12, 2022
Copy link
Member

@maltesander maltesander left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested and works fine (only one zookeeper version)!

--- PASS: kuttl (348.76s)
    --- PASS: kuttl/harness (0.00s)
        --- PASS: kuttl/harness/upgrade_zookeeper-3.8.0-stackable0.7.0_nifi_old-1.15.0-stackable0.4.0_nifi_new-1.16.3-stackable0.1.0 (164.70s)
        --- PASS: kuttl/harness/upgrade_zookeeper-3.8.0-stackable0.7.0_nifi_old-1.16.0-stackable0.1.0_nifi_new-1.16.3-stackable0.1.0 (182.84s)
PASS

Some issues/questions:

  1. The (old) reporting task job is not getting deleted (missing in orphaned resources) after update
  2. Does it make sense to use one of the python tests (e.g. test_nifi.py to check for registered nodes) in the upgrade tests?
  3. Do we need a flow-file/data test? In terms of are the versions and their data really compatible? We do that e.g. in Kafka.
  4. Does it make sense to emit the log messages as events ("Cluster has been stopped for a restart, will scale back up.")?

rust/operator-binary/src/controller.rs Outdated Show resolved Hide resolved
rust/operator-binary/src/controller.rs Outdated Show resolved Hide resolved
rust/operator-binary/src/controller.rs Outdated Show resolved Hide resolved
tests/templates/kuttl/upgrade/00-assert.yaml Outdated Show resolved Hide resolved
@soenkeliebau
Copy link
Member Author

soenkeliebau commented Aug 12, 2022

  1. The (old) reporting task job is not getting deleted (missing in orphaned resources) after update

We need to have a look at that, yes. I noticed this as well, but figured this will mostly be an issue during testing, when we trigger the upgrade before the job for the old version is finished.
For the tests we could solve it "preliminarily" by adding a condition to wait for the job to succeed in the install NiFi step I think..

  1. Does it make sense to use one of the python tests (e.g. test_nifi.py to check for registered nodes) in the upgrade tests?

It absolutely does, I'll look into this when I get back. Or if someone else wants to pick this up in the meantime, please do feel free to!

  1. Do we need a flow-file/data test? In terms of are the versions and their data really compatible? We do that e.g. in Kafka.

We should have that, yes.

  1. Does it make sense to emit the log messages as events ("Cluster has been stopped for a restart, will scale back up.")?

It does!

soenkeliebau and others added 5 commits September 9, 2022 12:04
# Conflicts:
#	deploy/crd/nificluster.crd.yaml
#	docs/modules/ROOT/pages/usage.adoc
#	rust/crd/src/authentication.rs
#	rust/crd/src/lib.rs
@razvan razvan self-requested a review September 19, 2022 09:03
@razvan
Copy link
Member

razvan commented Sep 19, 2022

@maltesander maltesander self-requested a review September 20, 2022 07:11
Copy link
Member

@maltesander maltesander left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just minor things and a question:
Do we want to try loading a flow template / data in general?

@sbernauer used a job to ingest the flow: https://github.com/stackabletech/stackablectl/blob/main/demos/nifi-kafka-druid-water-level-data/create-nifi-ingestion-job.yaml and https://github.com/stackabletech/stackablectl/blob/main/demos/nifi-kafka-druid-water-level-data/IngestWaterLevelsToKafka.xml.
We could check in the up/downgrade that this is still available / working after the restart (a much easier flow)?

rust/operator-binary/src/controller.rs Outdated Show resolved Hide resolved
tests/templates/kuttl/upgrade/00-install-zk.yaml.j2 Outdated Show resolved Hide resolved
tests/templates/kuttl/upgrade/01-install-nifi.yaml.j2 Outdated Show resolved Hide resolved
tests/templates/kuttl/upgrade/02-assert.yaml Outdated Show resolved Hide resolved
tests/templates/kuttl/upgrade/03-assert.yaml Outdated Show resolved Hide resolved
tests/templates/kuttl/upgrade/04-upgrade-nifi.j2 Outdated Show resolved Hide resolved
tests/templates/kuttl/upgrade/07-prepare-test-nifi.yaml Outdated Show resolved Hide resolved
@razvan
Copy link
Member

razvan commented Sep 22, 2022

Copy link
Member

@maltesander maltesander left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! LGTM.

@razvan
Copy link
Member

razvan commented Sep 23, 2022

bors merge

bors bot pushed a commit that referenced this pull request Sep 23, 2022
…Fi version (#323)

Currenty we cannot upgrade NiFi without manual intervention, as NiFi can't run with mixed versions in a cluster and StateFulSets do not have an upgrade policy of "full".

This PR adds tracking of the deployed NiFi version to the status of the NiFi object and evaluates if a reconciliation signifies a change of the deployed version. If this is recognized a full stop of the cluster is performed and only then the updated version rolled out.

fixes #238

Signed-off-by: Sönke Liebau <[email protected]>


# Description

*Please add a description here. This will become the commit message of the merge request later.*



Co-authored-by: Sönke Liebau <[email protected]>
Co-authored-by: Razvan-Daniel Mihai <[email protected]>
@bors
Copy link
Contributor

bors bot commented Sep 23, 2022

Pull request successfully merged into main.

Build succeeded:

@bors bors bot changed the title Added code to perform full cluster stop when changing the deployed NiFi version [Merged by Bors] - Added code to perform full cluster stop when changing the deployed NiFi version Sep 23, 2022
@bors bors bot closed this Sep 23, 2022
@bors bors bot deleted the feat/update branch September 23, 2022 11:45
bors bot pushed a commit that referenced this pull request Sep 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

NiFi upgrade doesn't work
3 participants