Rolling upgrade of MongooseIM cluster #3012

Merged
merged 5 commits into from
Jan 25, 2021
25 changes: 25 additions & 0 deletions doc/operation-and-maintenance/Cluster-restart.md
@@ -0,0 +1,25 @@
If your MongooseIM cluster uses the Mnesia backend for any extension, restarting the nodes in the wrong order can cause issues with the distributed Mnesia nodes.

## How to restart a cluster:

Given two nodes, Node A and Node B, restart the cluster as follows:

![How to restart a cluster](cluster_restart.png)

Start the nodes in the reverse of the order in which they were stopped.
The first node you start should be the last one that went down.
For a cluster with 3 nodes stopped in `ABC` order, start them in `CBA` order.
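
The ordering rule above can be sketched as a small shell helper. This is only an illustration: the function name `start_order` is hypothetical and not part of MongooseIM, and it assumes you recorded the node names in the order in which you stopped them.

```bash
# Print the given node names in reverse order.
# Arguments: node names in the order in which they were stopped.
start_order() {
    order=""
    for node in "$@"; do
        # Prepend each node, so the last stopped node ends up first.
        order="$node${order:+ $order}"
    done
    printf '%s\n' "$order"
}

# Usage: nodes were stopped in A, B, C order.
# start_order A B C    # prints "C B A"
```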

## How NOT to restart a cluster:

Given two nodes, Node A and Node B:

![How not to restart a cluster](incorrect_cluster_restart.png)

When the nodes are stopped in `AB` order, starting node `A` first can cause issues with the distributed Mnesia nodes and leave the node not fully operational.

Changing the order of the restarted nodes can cause issues with distributed Mnesia.
Make sure to follow these recommendations if you are using the Mnesia backend for any of the extensions.
Please note that some extensions use the Mnesia backend by default, even when it is not set explicitly in the configuration file.

For more information about cluster configuration and maintenance, please see the [Cluster configuration and node management](Cluster-configuration-and-node-management.md) section.

This file was deleted.

89 changes: 89 additions & 0 deletions doc/operation-and-maintenance/Rolling-upgrade.md
@@ -0,0 +1,89 @@
## Rolling upgrade
For all MongooseIM production deployments we recommend running multiple server nodes connected in a cluster behind a load-balancer.
A rolling upgrade is the process of upgrading a MongooseIM cluster one node at a time.
Before the rolling upgrade, make sure the cluster has at least one node more than the number needed to handle your traffic, to guarantee availability and minimise downtime.
Running different MongooseIM versions at the same time beyond the duration of the upgrade is not recommended and not supported.
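
As a rough illustration of the sizing rule above, the sketch below computes the minimum cluster size as the number of nodes needed for peak traffic plus one spare. The function name `min_cluster_size` and the capacity figures are hypothetical; the per-node capacity has to come from load tests of your own deployment.

```bash
# Minimum cluster size for a rolling upgrade:
# nodes needed for peak traffic (rounded up) plus one spare node.
# $1: peak number of concurrent connections (assumed metric)
# $2: connections a single node can handle (assumed, from load tests)
min_cluster_size() {
    peak=$1
    per_node=$2
    echo $(( (peak + per_node - 1) / per_node + 1 ))
}

# Usage: 25000 peak connections, 10000 connections per node.
# min_cluster_size 25000 10000    # prints 4 (3 nodes for traffic + 1 spare)
```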

The rolling upgrade procedure is recommended over configuration reload, which has not been supported since version 4.1.

Please note that more complex upgrades, which involve schema updates, customisations or functional changes, might require a more specific, specially crafted migration procedure.

If you only want to change the configuration file, please follow steps 1, 3, 4, 6, 7 and 8.
This type of change can also be done one node at a time.
It requires you to check the cluster status, modify the configuration file and restart the node.

The usual MongooseIM cluster upgrade can be achieved with the following steps:

### 1. Check the cluster status.

Use the following command on the running nodes and examine the status of the cluster:

```bash
mongooseimctl mnesia info | grep "running db nodes"

# Example output:
# running db nodes = [mongooseim@node1, mongooseim@node2]
```

This command shows all running nodes.
A healthy cluster should list all nodes that are part of the cluster.
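
The check above could be automated along these lines. This is a sketch under assumptions: the helper name `count_running_nodes` and the variable `EXPECTED_NODES` are hypothetical, and the node list is assumed to be printed on a single line in the form shown above.

```bash
# Count the nodes listed in the "running db nodes" line read from stdin.
count_running_nodes() {
    grep "running db nodes" \
        | sed -e 's/.*\[\(.*\)\].*/\1/' \
        | tr ',' '\n' \
        | grep -c '[^[:space:]]'
}

# Usage (EXPECTED_NODES is a hypothetical variable holding your cluster size):
# running=$(mongooseimctl mnesia info | count_running_nodes)
# [ "$running" -eq "$EXPECTED_NODES" ] || echo "only $running nodes are running"
```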

Should you have any issues related to node clustering, please refer to the [Cluster configuration and node management](Cluster-configuration-and-node-management.md) section.

### 2. Copy the configuration file.

Make a copy of the configuration file before the upgrade, as some package managers might overwrite your custom configuration with the default one.
Please note that since version 4.1 the `*.cfg` MongooseIM configuration format is no longer supported, and the configuration needs to be rewritten in the new `*.toml` format.
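
One possible way to make such a copy is a timestamped backup next to the original file. The helper name `backup_config` is hypothetical, and the path in the usage line is an assumption; use the location of the configuration file in your installation.

```bash
# Copy a file to "<file>.bak.<timestamp>" and print the name of the copy.
# $1: path to the configuration file
backup_config() {
    src=$1
    dest="$src.bak.$(date +%Y%m%d%H%M%S)"
    # -p preserves the mode and modification time of the original file.
    cp -p -- "$src" "$dest" && printf '%s\n' "$dest"
}

# Usage (the path below is an assumption, adjust it to your installation):
# backup_config /etc/mongooseim/mongooseim.toml
```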

### 3. Apply the changes from the migration guide.

All modifications of the configuration file or updates of the database schema that are required to perform a version upgrade can be found in the Migration Guide section.
When upgrading across more than one version, please make sure to go over all consecutive migration guides.

For example, when migrating from MongooseIM 3.7 to 4.1, please familiarize yourself with and apply all necessary changes described in the following pages of the Migration Guide section:

* 3.7.0 to 4.0.0
* 4.0.0 to 4.0.1
* 4.0.1 to 4.1.0

### 4. Stop the running node.

Use the following command to stop the MongooseIM node:

```bash
mongooseimctl stop
```

### 5. Install the new MongooseIM version.

You can get the new version of MongooseIM by either [building MongooseIM from source code](../user-guide/How-to-build.md) or [downloading and upgrading from a package](../../user-guide/Getting-started/#download-a-package).

### 6. Start the node.

Use the following commands to start the MongooseIM node and check the status of the node and the cluster:

```bash
mongooseimctl start
mongooseimctl status

mongooseimctl mnesia info | grep "running db nodes"
```

### 7. Test the cluster.

Please verify that the nodes are running and part of the same cluster.
If the cluster is working as expected, the migration of the node is complete.
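
A minimal sketch of such a verification, assuming the `running db nodes` line has the form shown in step 1. The helper name `node_is_running` is hypothetical, and the node name in the usage line is taken from the example output above.

```bash
# Succeed if node name $1 appears in the "running db nodes" line on stdin.
# -F matches the name literally; -w avoids matching node1 inside node10.
node_is_running() {
    grep "running db nodes" | grep -q -w -F -- "$1"
}

# Usage, run on any node of the cluster after restarting mongooseim@node1:
# if mongooseimctl mnesia info | node_is_running mongooseim@node1; then
#     echo "node1 has rejoined the cluster"
# fi
```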

### 8. Upgrade the remaining nodes.

Once all the prior steps are completed successfully, repeat the process for each remaining node in the MongooseIM cluster.

## Further cluster upgrade considerations

Another way to upgrade a cluster while minimising possible downtime is to set up a parallel MongooseIM cluster running the newer version.
You can redirect the incoming traffic to the new cluster with the use of a load-balancer.

Once no connections are handled by the old cluster, it can be safely stopped and the migration is complete.

We highly recommend testing a new software release in a staging environment before it is deployed to production.

Should you need any help with the upgrade, deployments or load testing of your MongooseIM cluster, please reach out to us. MongooseIM consultancy and support are part of [our commercial offering](https://www.erlang-solutions.com/products/mongooseim.html).
3 changes: 2 additions & 1 deletion mkdocs.yml
@@ -84,7 +84,8 @@ nav:
- 'Logging configuration': 'operation-and-maintenance/Logging.md'
- 'Logging with Humio': 'operation-and-maintenance/Humio.md'
- 'Logging fields': 'operation-and-maintenance/Logging-fields.md'
- 'Reloading configuration on a running system': 'operation-and-maintenance/Reloading-configuration-on-a-running-system.md'
- 'Rolling upgrade': 'operation-and-maintenance/Rolling-upgrade.md'
- 'Cluster restart': 'operation-and-maintenance/Cluster-restart.md'
- 'Metrics': 'operation-and-maintenance/MongooseIM-metrics.md'
- 'System Metrics Privacy Policy': 'operation-and-maintenance/System-Metrics-Privacy-Policy.md'
- 'Distribution over TLS': 'operation-and-maintenance/tls-distribution.md'