Updated RabbitMQ-Chart to 1.46.1 & improved Reboot-Resilience #158
values.yaml
Outdated
# On unclean cluster restarts forceBoot is required to cleanup Mnesia tables (see: https://github.com/helm/charts/issues/13485)
forceBoot: true
As every option involves a trade-off, it's worth adding a comment for this new default:
- # On unclean cluster restarts forceBoot is required to cleanup Mnesia tables (see: https://github.com/helm/charts/issues/13485)
- forceBoot: true
+ # On unclean cluster restarts forceBoot is required to cleanup Mnesia tables (see: https://github.com/helm/charts/issues/13485)
+ # Use it only if you prefer availability over integrity.
+ forceBoot: true
Suggested comment is added
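For anyone who prefers integrity over availability, here is a minimal sketch of opting back out in a custom values file. It assumes `forceBoot` sits under the `rabbitmq-ha:` key like the other sub-chart settings touched in this PR; the file name is hypothetical.

```yaml
# custom-values.yaml (hypothetical file name)
rabbitmq-ha:
  # Assumption: forceBoot is passed through to the rabbitmq-ha sub-chart.
  # Setting it to false favors integrity over availability: after an
  # unclean restart the cluster may not boot until it is recovered by hand.
  forceBoot: false
```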
Thanks for the PR! 👍
Just left a few minor comments to address and we're good to merge.
@@ -458,7 +460,7 @@ rabbitmq-ha:
#rabbitmqMemoryHighWatermark: 512MB
#rabbitmqMemoryHighWatermarkType: absolute
# Up to 255 character string, should be fixed so that re-deploying the chart does not fail (see: https://github.com/helm/charts/issues/12371)
As we're defaulting the `rabbitmqErlangCookie` value for everyone, let's at least include a warning comment recommending that users change the default.
- # Up to 255 character string, should be fixed so that re-deploying the chart does not fail (see: https://github.com/helm/charts/issues/12371)
+ # Up to 255 character string, should be fixed so that re-deploying the chart does not fail (see: https://github.com/helm/charts/issues/12371)
+ # NB! It's highly recommended to change the default insecure rabbitmqErlangCookie value!
Warning is added as well
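To act on that warning, a values override along these lines might be used. This is only a sketch: the file name and the placeholder string are hypothetical, and it assumes the key lives under `rabbitmq-ha:` as the hunk header above suggests.

```yaml
# custom-values.yaml (hypothetical file name)
rabbitmq-ha:
  # Replace the placeholder with your own secret string (up to 255 characters).
  # Keep it identical across re-deployments so that existing PVs/PVCs
  # can be reused by the new pods.
  rabbitmqErlangCookie: "replace-with-your-own-random-string"
```

The value only needs to stay stable between deployments; how it is generated and stored is left to the operator.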
CHANGELOG.md
Outdated
@@ -1,7 +1,10 @@
# Changelog

## In Development

* Update `rabbitmq-ha` 3rd party chart from `1.44.1` to `1.46.1` (#158) (by @moonrail)
* Disable newly introduced `rabbitmq-ha` prometheus operator by default (#158) (by @moonrail)
This could be omitted: the changelog is too verbose for a single PR, and we're sticking with the previous default.
- * Disable newly introduced `rabbitmq-ha` prometheus operator by default (#158) (by @moonrail)
This line is now removed. I wasn't sure about three changelog lines for one PR either when writing it.
values.yaml
Outdated
- #rabbitmqErlangCookie: 8MrqQdCQ6AQ8U3MacSubHE5RqkSfvNaRHzvxuFcG
+ rabbitmqErlangCookie: 8MrqQdCQ6AQ8U3MacSubHE5RqkSfvNaRHzvxuFcG
This was, in fact, proposed in an older PR, and to me it's a questionable, insecure default.
I realize that it could be useful for someone re-deploying (destroying/creating) rabbitmq-ha many times in a row. It could also take them a while to understand that they need the same `rabbitmqErlangCookie` for re-deployment not to fail with the same PV/PVC.
But instead of forcing this RabbitMQ cookie on everyone, we previously decided to keep the value commented out and include a recommendation about why it could be useful or important. This way, someone experiencing the re-deployment issue would be able to consult this Helm values hint.
Considering this is the second PR to enable `rabbitmqErlangCookie` by default, let's do it 👍
This was enabled in my cluster. I ran into this issue when we upgraded clusters (2 times) for the Kubernetes version in EKS. AWS drains pods from the old node groups and shifts them to the new node groups. Even though all pods shifted fine, the `rabbitmqErlangCookie` was mismatched and not set in the Helm chart, so I ran into an issue that required me to delete the PVCs and then run `helm upgrade` to reconstruct the rabbitmq-ha cluster. Of course, all of that means downtime. But since enabling it, I could see all pods shift to the new nodes just fine, and the app was online the whole time! I would recommend it. Thanks!
Looks good!
…bernetes installations. Enabled Erlang-Cookie & Force-Boot, as otherwise Cluster-Data is not reusable after Deployment-Restart. Due to new RabbitMQ-Version enabling Prometheus-Monitoring by default, most installations would fail, therefore disabled it by default.
Rebased, conflict should be gone now
👍
Hello all,
This potentially helps in part with #11.
This pull request updates the RabbitMQ chart dependency to 1.46.1. On our Kubernetes installations (1.17, 1.18, 1.19), 1.44.1 would not run because of mysterious disk-space complaints, even though it was correctly assigned to a PersistentVolume and able to read from and write to it.
As this new version enables Prometheus monitoring by default, most current installations would fail, so it is disabled by default here.
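For clusters that do run the Prometheus Operator, the monitoring could presumably be switched back on through a values override. A minimal sketch follows; the `prometheus.operator.enabled` key path is an assumption about the `rabbitmq-ha` chart and is not shown anywhere in this conversation, so check it against the chart's own values before relying on it.

```yaml
# custom-values.yaml (hypothetical file name)
rabbitmq-ha:
  prometheus:
    operator:
      # Assumed key path. Only enable this when the Prometheus Operator
      # and its CRDs are already installed in the cluster.
      enabled: true
```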
* Enabled `rabbitmqErlangCookie`, as otherwise cluster data is not reusable after a RabbitMQ deployment restart/rebuild (or a short period of 0 replicas).
* Added `forceBoot`, as otherwise cluster data is not reusable after a RabbitMQ deployment restart/rebuild (or a short period of 0 replicas), because Mnesia tables are not cleaned up and prevent RabbitMQ from booting. See helm/charts#13485.

So this should improve the user experience by not having to ditch PersistentVolumes after RabbitMQ redeployments.
I'm not really sure about this, but if StackStorm holds queued executions in RabbitMQ, this would also help in disaster cases by not losing all running executions.
Tested with 3.3dev on kubernetes 1.17, 1.18 & 1.19.
Please let me know if there is something to improve. :)