Expand MDP deprecation docs with migration guide (#73367)
This commit expands the note about the deprecation of multiple data
paths, adding guidance on how to configure a single filesystem spanning
multiple disks and how to migrate to such a configuration without
downtime.

Closes #71871
DaveCTurner committed Jun 1, 2021
1 parent ab788be commit 19d8568
Showing 1 changed file with 84 additions and 5 deletions.
89 changes: 84 additions & 5 deletions docs/reference/migration/migrate_7_13.asciidoc
@@ -123,13 +123,92 @@ aggregations are rarely useful and often unintended.
[%collapsible]
====
*Details* +
Support for multiple paths in the `path.data` setting is now deprecated. We
introduced this option as a way to support multi-disk setups. It has since been
a source of user complaints due to confusing and unintuitive behavior.
The `path.data` setting accepts a list of data paths, but if you specify
multiple paths then the behaviour is unintuitive and usually does not give the
desired outcomes. Support for multiple data paths is now deprecated and will be
removed in 8.0.0.
*Impact* +
Specify a single path in `path.data`. To use multiple disks, use a RAID
hardware configuration or similar hardware solution.
Specify a single path in `path.data`. If needed, you can create a filesystem
which spans multiple disks with a hardware virtualisation layer such as RAID,
or a software virtualisation layer such as Logical Volume Manager (LVM) on
Linux or Storage Spaces on Windows. If you wish to use multiple data paths on a
single machine then you must run one node for each data path.
If you currently use multiple data paths in a
<<high-availability-cluster-design,highly available cluster>> then you can
migrate to a setup that uses a single path for each node without downtime using
a process similar to a <<restart-cluster-rolling,rolling restart>>: shut each
node down in turn and replace it with one or more nodes each configured to use
a single data path. In more detail, follow this process for each node that
currently has multiple data paths.
1. Take a snapshot to protect your data in case of disaster.
2. Optionally, migrate the data away from the target node by using an
<<cluster-shard-allocation-filtering,allocation filter>>:
+
[source,console]
--------------------------------------------------
PUT _cluster/settings
{
"transient": {
"cluster.routing.allocation.exclude._name": "target-node-name"
}
}
--------------------------------------------------
+
You can use the <<cat-allocation,cat allocation API>> to track progress of this
data migration. If some shards do not migrate then the
<<cluster-allocation-explain,cluster allocation explain API>> will help you to
determine why.
3. Follow the steps in the <<restart-cluster-rolling,rolling restart process>>
up to and including shutting the target node down.
4. Ensure your cluster health is `yellow` or `green`, so that there is a copy
of every shard assigned to at least one of the other nodes in your cluster.
5. If applicable, remove the allocation filter applied in step 2:
+
[source,console]
--------------------------------------------------
PUT _cluster/settings
{
"transient": {
"cluster.routing.allocation.exclude._name": null
}
}
--------------------------------------------------
6. Discard the data held by the stopped node by deleting the contents of its
data paths.
7. Reconfigure your storage. For instance, combine your disks into a single
filesystem using LVM or Storage Spaces. Ensure that your reconfigured storage
has sufficient space for the data that it will hold.
8. Reconfigure your node by adjusting the `path.data` setting in its
`elasticsearch.yml` file. If needed, install more nodes each with their own
`path.data` setting pointing at a separate data path.
9. Start the new nodes and follow the rest of the
<<restart-cluster-rolling,rolling restart process>> for them.
10. Ensure your cluster health is `green`, so that every shard has been
assigned.
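The progress and health checks mentioned in the steps above can be issued as
console requests. The following is a minimal sketch, not part of the original
procedure: the index name `my-index-000001`, the shard number and the `60s`
timeout are illustrative assumptions that you should replace with your own
values.

[source,console]
--------------------------------------------------
# Step 2: watch the shard count on the target node fall towards zero
GET _cat/allocation?v=true

# Step 2: ask why one particular shard (hypothetical name) has not moved
GET _cluster/allocation/explain
{
"index": "my-index-000001",
"shard": 0,
"primary": true
}

# Step 4: wait until the cluster health is at least yellow
GET _cluster/health?wait_for_status=yellow&timeout=60s

# Step 10: wait until the cluster health is green
GET _cluster/health?wait_for_status=green&timeout=60s
--------------------------------------------------

If a health request times out, the response still reports the current status,
so check the `status` and `timed_out` fields before you continue.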
You can alternatively add some number of single-data-path nodes to your
cluster, migrate all your data over to these new nodes using
<<cluster-shard-allocation-filtering,allocation filters>>, and then remove the
old nodes from the cluster. This approach will temporarily double the size of
your cluster so it will only work if you have the capacity to expand your
cluster like this.
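As a sketch of this alternative, an allocation filter can exclude several old
nodes at once by listing their names separated by commas. The node names
`old-node-1` and `old-node-2` are hypothetical placeholders for the names of
your own multi-data-path nodes.

[source,console]
--------------------------------------------------
# Move all shards away from the old nodes (hypothetical node names)
PUT _cluster/settings
{
"transient": {
"cluster.routing.allocation.exclude._name": "old-node-1,old-node-2"
}
}

# Confirm that the old nodes no longer hold any shards before removing them
GET _cat/allocation?v=true
--------------------------------------------------

After you remove the old nodes, reset the filter by setting it to `null`.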
If you currently use multiple data paths but your cluster is not highly
available then you can migrate to a non-deprecated configuration by taking
a snapshot, creating a new cluster with the desired configuration and restoring
the snapshot into it.
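A minimal sketch of the snapshot and restore requests follows. It assumes that
a snapshot repository is already registered under the same name in both
clusters. The names `my_repository` and `my_snapshot` are hypothetical.

[source,console]
--------------------------------------------------
# On the old cluster: take a snapshot and wait for it to complete
PUT _snapshot/my_repository/my_snapshot?wait_for_completion=true

# On the new cluster: restore the snapshot
POST _snapshot/my_repository/my_snapshot/_restore
--------------------------------------------------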
====

[[action-destructive-defaults-to-true]]