This repository has been archived by the owner on Feb 22, 2022. It is now read-only.

[stable/rabbitmq] Recover from "Waiting for Mnesia tables" after all nodes forced shutdown #13485

Closed
denis111 opened this issue May 3, 2019 · 33 comments · Fixed by #14149

Comments

@denis111
Contributor

denis111 commented May 3, 2019

Is your feature request related to a problem? Please describe.
Yes. In our dev/staging environment in AWS we shut down the EKS cluster nodes at night (the scaling group is set to 0) to save costs, and we have persistence enabled for RabbitMQ, so this acts like an unexpected shutdown of all RabbitMQ nodes. The cluster can't recover and keeps looping on "Waiting for Mnesia tables". I tried setting podManagementPolicy and service.alpha.kubernetes.io/tolerate-unready-endpoints: "true", but it had no effect; it still keeps looping with "Waiting for Mnesia tables".

Describe the solution you'd like
Could the solution from #9645 (comment) be applied as an option? We'd really prefer availability over integrity.

Describe alternatives you've considered
Don't use persistence.

@miguelaeh
Collaborator

Hi @denis111 ,
Do you know the order in which the nodes were shut down? I am not a RabbitMQ expert, but the documentation says this:

Normally when you shut down a RabbitMQ cluster altogether, the first node you restart should be the last one to go down, since it may have seen things happen that other nodes did not. But sometimes that's not possible: for instance if the entire cluster loses power then all nodes may think they were not the last to shut down.

Link to documentation: https://www.rabbitmq.com/rabbitmqctl.8.html#force_boot

Could you check whether force_boot works in the case where you don't know the order?
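For reference, a minimal sketch of the documented sequence when run on a node directly (on Kubernetes the pod may be restarted as soon as the app is stopped, as the following comments show):

# Sketch of the documented force_boot flow; run on the affected node.
rabbitmqctl stop_app
rabbitmqctl force_boot   # allow this node to boot even if it was not the last to shut down
rabbitmqctl start_app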

@denis111
Contributor Author

@miguelaeh Thank you for answering. No, we can't know the order; this is just an autoscaling group scheduled to scale to 0 instances at night when nobody is working. It's unacceptable for us if the cluster can't recover from this sort of "disaster" (an "unexpected" shutdown of all nodes), so in that case we'd rather not use persistence, since we prefer availability over integrity.

I hope to try playing with force_boot this Friday, and I will let you know if it works.

@miguelaeh
Collaborator

Thank you @denis111,
let me know what happens when you try this option.

@denis111
Contributor Author

Well, first, I can't execute rabbitmqctl force_boot because it says "Error: this command requires the 'rabbit' app to be stopped on the target node. Stop it with 'rabbitmqctl stop_app'." But if we stop the app, the pod just restarts without giving us a chance to execute "rabbitmqctl force_boot"...
So I created a force_load file in "/opt/bitnami/rabbitmq/var/lib/rabbitmq/mnesia/rabbit@rabbitmq-pre-0.rabbitmq-pre-headless.pre.svc.cluster.local" (in my case), and it worked!
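For anyone else hitting this, a minimal sketch of that manual workaround run from outside the pod; the pod name, namespace (pre) and Bitnami data path are taken from the directory quoted above, so adjust them to your release:

# Sketch only: pod name, namespace and mnesia directory come from the example above.
kubectl exec -n pre rabbitmq-pre-0 -- \
  touch "/opt/bitnami/rabbitmq/var/lib/rabbitmq/mnesia/rabbit@rabbitmq-pre-0.rabbitmq-pre-headless.pre.svc.cluster.local/force_load"

On its next boot the node should then stop waiting for its peers' Mnesia tables.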

@miguelaeh
Collaborator

I'm glad it worked.
That is a cool solution.

@denis111
Contributor Author

Yes, but how do we automate it? I mean the creation of the force_load file, in some init container maybe...

@miguelaeh
Collaborator

You could try mounting the file in an init container via a ConfigMap, but if you don't need to execute any command before the main container starts, I guess you could just mount the file in the main container (also with a ConfigMap).

@denis111
Contributor Author

I can't find the "helm way" to do it; the existing ConfigMap template in the rabbitmq chart doesn't allow adding extra files, and the StatefulSet template doesn't allow adding an extra init container or an extra volume mount...

@miguelaeh
Collaborator

The chart does not support that at the moment. You would have to add it manually (you can clone the repository and modify the chart to fit your needs).

@denis111
Contributor Author

Well, we'd like to use the mainstream chart, so I'll see if I can make a pull request.

@denis111
Contributor Author

denis111 commented Jun 4, 2019

I've found that if we enable the forceBoot option on a new install without an existing PVC (i.e. with a clean new volume), RabbitMQ is unable to start and fails with Error: enoent. I'm creating a PR to address this issue.
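For context, force_load only makes sense when a Mnesia database already exists; on a clean volume there is no node directory to create the file in. A rough sketch of the kind of guard that avoids the enoent failure (the directory path mirrors the Bitnami layout quoted earlier; the chart's actual variable names may differ):

# Only request a forced boot when a previous database directory exists;
# on a brand new volume there is nothing to force-load yet.
MNESIA_DIR="/opt/bitnami/rabbitmq/var/lib/rabbitmq/mnesia/rabbit@rabbitmq-pre-0.rabbitmq-pre-headless.pre.svc.cluster.local"
if [ -d "${MNESIA_DIR}" ]; then
  touch "${MNESIA_DIR}/force_load"
fi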

alemorcuq pushed a commit to alemorcuq/charts-1 that referenced this issue Jun 6, 2019

* [stable/rabbitmq] fix Error: enoent with forceBoot on new install (see helm#13485)

Signed-off-by: Denis Kalgushkin <[email protected]>

* [stable/rabbitmq] Bump chart version for PR 14491 (see helm#13485)

Signed-off-by: Denis Kalgushkin <[email protected]>
anasinnyk pushed a commit to MacPaw/charts that referenced this issue Jun 29, 2019

* [stable/rabbitmq] fix Error: enoent with forceBoot on new install (see helm#13485)

Signed-off-by: Denis Kalgushkin <[email protected]>

* [stable/rabbitmq] Bump chart version for PR 14491 (see helm#13485)

Signed-off-by: Denis Kalgushkin <[email protected]>
Signed-off-by: Andrii Nasinnyk <[email protected]>
@mhyousefi

mhyousefi commented Jan 28, 2020

Hey @denis111, I'm having this issue with the latest version of stable/rabbitmq-ha:

[warning] <0.311.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit_durable_queue]}

Do I need to make any modifications to the values.yaml to make use of your adjustments?

@mhyousefi

Actually, removing the pvcs before redeploying my rabbit resolved the problem for me.
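If you go this route, note that it throws away the persisted message store. A hedged sketch (the PVC names are illustrative; StatefulSets name claims <template>-<statefulset>-<ordinal>, so list yours first):

# WARNING: this deletes the persisted RabbitMQ data.
kubectl get pvc
# Illustrative claim names for a release called my-release; substitute your own.
kubectl delete pvc data-my-release-rabbitmq-ha-0 data-my-release-rabbitmq-ha-1 data-my-release-rabbitmq-ha-2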

@akrepon

akrepon commented Feb 5, 2020

I think the easiest solution is to have an init container which deletes the mnesia folder during startup.

@andylippitt

If the RabbitMQ node is waiting for the other nodes to come up, and they aren't coming up because the StatefulSet is booting them sequentially, how about podManagementPolicy: Parallel?

Parallel: "will create pods in parallel to match the desired scale without waiting, and on scale down will delete all pods at once" - https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.11/#statefulsetspec-v1-apps

@andylippitt

Some basic testing with podManagementPolicy: Parallel:

As expected, all nodes start simultaneously and the cluster seems to recover correctly. Further, with missing volumes, the node discovery/cluster init seems to work as expected; however, the cluster name was randomized, if that matters to you.

@vkasgit

vkasgit commented Feb 12, 2020

@andylippitt smart idea to use podManagementPolicy: Parallel.
However, I was wondering what would happen if you upgrade your RMQ cluster, e.g. to a new RMQ image. In that case, won't it take down all the pods at once and recreate all of them in parallel with the new image?

@vkasgit

vkasgit commented Feb 13, 2020

Ignore my comment above; updateStrategy: RollingUpdate will take care of that situation.

Do you think there is any edge case where podManagementPolicy: Parallel will create an outage, since it takes down all 3 pods?

@andylippitt

I have found a problem with podManagementPolicy: Parallel. There's a race condition on initialization if you don't specify rabbitmqErlangCookie. I now have a condition where a single pod is running with a RABBITMQ_ERLANG_COOKIE which is different from the current value of the secret. I suspect this was a result of concurrent initialization and will try to reinstall with an explicit value.

@vkasgit

vkasgit commented Feb 18, 2020

Yes, I did test with podManagementPolicy: Parallel but faced issues; sometimes the pods did not come back healthy. Instead, setting the force_boot flag to true, which was suggested earlier in this thread, worked for me. I tested multiple times, bringing all the pods down at once and bringing pods down one at a time within 2-5 minutes to create some sort of a mess, and with that flag on all the pods came back healthy.

Additional setting in our custom values.yaml:
We are also setting a lifecycle hook so the pods terminate gracefully when taken down.
lifecycle:
  preStop:
    exec:
      command: ["rabbitmqctl", "shutdown"]

@andylippitt

andylippitt commented Feb 18, 2020

@vkasgit were you specifying an explicit value for rabbitmqErlangCookie in your failed testing?

Edit: I think the issue is not a concurrency issue; rather, in our case we just ran into this: https://github.com/helm/charts/issues/5167. tl;dr: specify rabbitmqErlangCookie in your prod installs.
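A small sketch of doing that at install/upgrade time, assuming the rabbitmqErlangCookie value name used above (the release and chart names are placeholders):

# Generate one cookie, keep it somewhere safe, and reuse the same value on every upgrade.
COOKIE=$(openssl rand -hex 16)
helm upgrade my-rabbitmq stable/rabbitmq-ha --reuse-values \
  --set rabbitmqErlangCookie="${COOKIE}"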

@vkasgit

vkasgit commented Feb 19, 2020

In our installation that was failing, I had the secret pre-created with the Erlang cookie.

@zffocussss

zffocussss commented Mar 31, 2020

Actually, removing the pvcs before redeploying my rabbit resolved the problem for me.

is it safe to remove the pvc?

@hakanozanlagan

Actually, removing the pvcs before redeploying my rabbit resolved the problem for me.

is it safe to remove the pvc?

No, it's not safe. These mounts keep important files (DB files, etc.):
volumeMounts:
  - mountPath: /var/lib/rabbitmq
    name: data
  - mountPath: /etc/rabbitmq
    name: config
  - mountPath: /etc/definitions
    name: definitions
    readOnly: true
dnsPolicy: ClusterFirst

My problem was solved with the method below (replace clustername with your own). Run the exec command while the pod state is Running:
kubectl exec -ti clustername-rbmq-rabbitmq-ha-0 /bin/sh
cd /var/lib/rabbitmq/mnesia/rabbit@perfx-rbmq-rabbitmq-ha-0.clustername-rbmq-rabbitmq-ha-discovery.hrnext-prod.svc.cluster.local
touch force_load

watch for pod statuses

@rakeshnambiar

@hakanozanlagan first of all, thanks for posting your solution. I tried the same steps, but unfortunately I am running as the rabbitmq user, which doesn't have permission on that folder, and I don't have any sudo access. Is there any alternative solution?


@vkasgit

vkasgit commented May 14, 2020

@rakeshnambiar In your Helm chart values.yml, did you explicitly try setting the force_boot flag to true? Try that option. Also check your user permissions; those can also be set through the values.yml.

@rakeshnambiar

Hi @vkasgit, thanks, the force_boot option solved the issue, and I can also see runAsUser etc. in the values yaml. By the way, by default it creates 3 pods and I can see 3 PVCs as well. Is this expected?

@ytjohn

ytjohn commented May 19, 2020

@vkasgit We occasionally run into this Mnesia table issue ourselves (which we have been fixing by deleting the PVC). I was curious whether setting updateStrategy to RollingUpdate (instead of the default OnDelete) eliminates the need for force_boot? Our podManagementPolicy is the default OrderedReady. In fact, other than basic passwords and policies, our values are otherwise default.

@vkasgit

vkasgit commented May 19, 2020

@ytjohn Do you have the forceBoot: true flag on and still occasionally run into the Mnesia table issue?

The following are some settings I have, and I haven't run into the Mnesia issue so far (knock on wood). Try adding the lifecycle hook and see if that helps. What it does is: when an RMQ node is forcefully taken down, the preStop command kicks in and performs a graceful termination.

podManagementPolicy: OrderedReady
updateStrategy: RollingUpdate
forceBoot: true

lifecycle:
  preStop:
    exec:
      command: [rabbitmqctl, shutdown]

EDIT: Please double-check the indentation of lifecycle in your own values file.

@ytjohn

ytjohn commented May 19, 2020

I haven't tried forceBoot: true yet, but it seemed a rolling update would pretty much take care of the need for forceBoot. That said, I don't think it will hurt either, so we will go ahead and set them both, and if that keeps the Mnesia table issue from popping up, we'll call it good. Thank you.

@Davidrjx

Davidrjx commented Jun 6, 2020

@akrepon deleting the Mnesia database would bring the crashed pod back as a standalone or blank node.

@Davidrjx

Davidrjx commented Jun 7, 2020


@hakanozanlagan I do not fully understand your solution about the cluster_name change; is it just touching a force_load file in the mnesia data dir?

@minhnguyenvan95


For RabbitMQ installed from the rabbitmq-ha Helm chart, the equivalent directory (where @denis111 created the force_load file) is /var/lib/rabbitmq/mnesia/rabbit@rabbitmq-ha-0.rabbitmq-ha-discovery.acbo-queues.svc.cluster.local
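A hedged way to avoid hard-coding the node directory at all, assuming the rabbitmq-ha layout above and that the node name follows the rabbit@<pod FQDN> convention seen in this thread:

# Run inside the pod; the node directory name is assumed to be rabbit@<pod FQDN>.
touch "/var/lib/rabbitmq/mnesia/rabbit@$(hostname -f)/force_load"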
