Remove persistent cluster settings tool #50694

Merged
73 changes: 67 additions & 6 deletions docs/reference/commands/node-tool.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -3,9 +3,9 @@

The `elasticsearch-node` command enables you to perform certain unsafe
operations on a node that are only possible while it is shut down. This command
allows you to adjust the <<modules-node,role>> of a node and may be able to
recover some data after a disaster or start a node even if it is incompatible
with the data on disk.
allows you to adjust the <<modules-node,role>> of a node, unsafely edit cluster
settings, and may be able to recover some data after a disaster or start a node
even if it is incompatible with the data on disk.

[float]
=== Synopsis
Expand All @@ -20,13 +20,17 @@ bin/elasticsearch-node repurpose|unsafe-bootstrap|detach-cluster|override-versio
[float]
=== Description

This tool has four modes:
This tool has five modes:

* `elasticsearch-node repurpose` can be used to delete unwanted data from a
node if it used to be a <<data-node,data node>> or a
<<master-node,master-eligible node>> but has been repurposed not to have one
or other of these roles.

* `elasticsearch-node remove-settings` can be used to remove persistent settings
from the cluster state in cases where it contains incompatible settings that
prevent the cluster from forming.

* `elasticsearch-node unsafe-bootstrap` can be used to perform _unsafe cluster
bootstrapping_. It forces one of the nodes to form a brand-new cluster on
its own, using its local copy of the cluster metadata.
Expand Down Expand Up @@ -76,6 +80,26 @@ The tool provides a summary of the data to be deleted and asks for confirmation
before making any changes. You can get detailed information about the affected
indices and shards by passing the verbose (`-v`) option.

[float]
==== Removing persistent cluster settings

There may be situations where a node contains persistent cluster
settings that prevent the cluster from forming. Since the cluster cannot form,
it is not possible to remove these settings using the
<<cluster-update-settings>> API.

The `elasticsearch-node remove-settings` tool allows you to forcefully remove
those persistent settings from the on-disk cluster state. The tool takes as
parameters a list of the settings to remove, and also supports wildcard
patterns.

The intended use is:

* Stop the node
* Run `elasticsearch-node remove-settings name-of-setting-to-remove` on the node
* Repeat for all other master-eligible nodes
* Start the nodes

[float]
==== Recovering data after a disaster

Expand Down Expand Up @@ -143,9 +167,9 @@ If there is at least one remaining master-eligible node, but it is not possible
to restart a majority of them, then the `elasticsearch-node unsafe-bootstrap`
command will unsafely override the cluster's <<modules-discovery-voting,voting
configuration>> as if performing another
<<modules-discovery-bootstrap-cluster,cluster bootstrapping process>>.
<<modules-discovery-bootstrap-cluster,cluster bootstrapping process>>.
The target node can then form a new cluster on its own by using
the cluster metadata held locally on the target node.
the cluster metadata held locally on the target node.

[WARNING]
These steps can lead to arbitrary data loss since the target node may not hold the latest cluster
Expand Down Expand Up @@ -290,6 +314,9 @@ it can join a different cluster.
`override-version`:: Overwrites the version number stored in the data path so
that a node can start despite being incompatible with the on-disk data.

`remove-settings`:: Forcefully removes the provided persistent cluster settings
from the on-disk cluster state.

`-E <KeyValuePair>`:: Configures a setting.

`-h, --help`:: Returns all of the command parameters.
Expand Down Expand Up @@ -346,6 +373,40 @@ Confirm [y/N] y
Node successfully repurposed to no-master and no-data.
----

[float]
==== Removing persistent cluster settings

If your nodes contain persistent cluster settings that prevent the cluster
from forming, and which therefore can't be removed using the
<<cluster-update-settings>> API, you can run the following commands to remove
one or more cluster settings.

[source,txt]
----
node$ ./bin/elasticsearch-node remove-settings xpack.monitoring.exporters.my_exporter.host

WARNING: Elasticsearch MUST be stopped before running this tool.

The following settings will be removed:
xpack.monitoring.exporters.my_exporter.host: "10.1.2.3"

You should only run this tool if you have incompatible settings in the
cluster state that prevent the cluster from forming.
This tool can cause data loss and its use should be your last resort.

Do you want to proceed?

Confirm [y/N] y

Settings were successfully removed from the cluster state
----

You can also use wildcards to remove multiple settings, for example:

[source,txt]
----
node$ ./bin/elasticsearch-node remove-settings xpack.monitoring.*
----
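To illustrate which keys a wildcard pattern such as `xpack.monitoring.*` could select, here is a minimal Java sketch. It is not the tool's implementation (the command itself calls `Regex.simpleMatch`); the class `WildcardSketch`, its helpers `toRegex` and `matchingKeys`, and the sample keys are assumptions made for illustration. It treats `*` as "any sequence of characters" and every other character, including `.`, as a literal.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class WildcardSketch {

    // Hypothetical helper: converts a pattern containing '*' wildcards into a regex.
    // Everything between wildcards is quoted so that '.' stays a literal dot.
    static Pattern toRegex(String glob) {
        String[] parts = glob.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*"); // each '*' matches any run of characters
            }
            regex.append(Pattern.quote(parts[i]));
        }
        return Pattern.compile(regex.toString());
    }

    // Returns the keys that the whole pattern matches, in their original order.
    static List<String> matchingKeys(String glob, List<String> keys) {
        Pattern pattern = toRegex(glob);
        return keys.stream()
            .filter(key -> pattern.matcher(key).matches())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample keys chosen for illustration only.
        List<String> keys = List.of(
            "xpack.monitoring.exporters.my_exporter.host",
            "xpack.monitoring.collection.enabled",
            "cluster.routing.allocation.disk.threshold_enabled");
        // Only the two xpack.monitoring keys are selected.
        System.out.println(matchingKeys("xpack.monitoring.*", keys));
    }
}
```

Under these assumptions a pattern without any `*` behaves as an exact name, which corresponds to the single-setting invocation shown earlier.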

[float]
==== Unsafe cluster bootstrapping

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@
import org.elasticsearch.ElasticsearchException;
import org.elasticsearch.cli.EnvironmentAwareCommand;
import org.elasticsearch.cli.Terminal;
import org.elasticsearch.cli.UserException;
import org.elasticsearch.cluster.ClusterModule;
import org.elasticsearch.cluster.ClusterName;
import org.elasticsearch.cluster.ClusterState;
Expand Down Expand Up @@ -97,7 +98,7 @@ public static Tuple<Long, ClusterState> loadTermAndClusterState(PersistedCluster
return Tuple.tuple(bestOnDiskState.currentTerm, clusterState(env, bestOnDiskState));
}

protected void processNodePaths(Terminal terminal, OptionSet options, Environment env) throws IOException {
protected void processNodePaths(Terminal terminal, OptionSet options, Environment env) throws IOException, UserException {
terminal.println(Terminal.Verbosity.VERBOSE, "Obtaining lock for node");
try (NodeEnvironment.NodeLock lock = new NodeEnvironment.NodeLock(logger, env, Files::exists)) {
final Path[] dataPaths =
Expand Down Expand Up @@ -145,7 +146,8 @@ protected boolean validateBeforeLock(Terminal terminal, Environment env) {
* @param options the command line options
* @param env the env of the node to process
*/
protected abstract void processNodePaths(Terminal terminal, Path[] dataPaths, OptionSet options, Environment env) throws IOException;
protected abstract void processNodePaths(Terminal terminal, Path[] dataPaths, OptionSet options, Environment env)
throws IOException, UserException;

protected NodeEnvironment.NodePath[] toNodePaths(Path[] dataPaths) {
return Arrays.stream(dataPaths).map(ElasticsearchNodeCommand::createNodePath).toArray(NodeEnvironment.NodePath[]::new);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,7 @@ public NodeToolCli() {
subcommands.put("unsafe-bootstrap", new UnsafeBootstrapMasterCommand());
subcommands.put("detach-cluster", new DetachClusterCommand());
subcommands.put("override-version", new OverrideNodeVersionCommand());
subcommands.put("remove-settings", new RemoveSettingsCommand());
}

public static void main(String[] args) throws Exception {
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,104 @@
/*
* Licensed to Elasticsearch under one or more contributor
* license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright
* ownership. Elasticsearch licenses this file to you under
* the Apache License, Version 2.0 (the "License"); you may
* not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.elasticsearch.cluster.coordination;

import joptsimple.OptionSet;
import joptsimple.OptionSpec;
import org.elasticsearch.cli.ExitCodes;
import org.elasticsearch.cli.Terminal;
import org.elasticsearch.cli.UserException;
import org.elasticsearch.cluster.ClusterState;
import org.elasticsearch.cluster.metadata.MetaData;
import org.elasticsearch.common.collect.Tuple;
import org.elasticsearch.common.regex.Regex;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.env.Environment;
import org.elasticsearch.gateway.PersistedClusterStateService;

import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

public class RemoveSettingsCommand extends ElasticsearchNodeCommand {

    static final String SETTINGS_REMOVED_MSG = "Settings were successfully removed from the cluster state";
    static final String CONFIRMATION_MSG =
        DELIMITER +
            "\n" +
            "You should only run this tool if you have incompatible settings in the\n" +
            "cluster state that prevent the cluster from forming.\n" +
            "This tool can cause data loss and its use should be your last resort.\n" +
            "\n" +
            "Do you want to proceed?\n";

    private final OptionSpec<String> arguments;

    public RemoveSettingsCommand() {
        super("Removes persistent settings from the cluster state");
        arguments = parser.nonOptions("setting names");
    }

    @Override
    protected void processNodePaths(Terminal terminal, Path[] dataPaths, OptionSet options, Environment env)
        throws IOException, UserException {
        final List<String> settingsToRemove = arguments.values(options);
        if (settingsToRemove.isEmpty()) {
            throw new UserException(ExitCodes.USAGE, "Must supply at least one setting to remove");
        }

        final PersistedClusterStateService persistedClusterStateService = createPersistedClusterStateService(dataPaths);

        terminal.println(Terminal.Verbosity.VERBOSE, "Loading cluster state");
        final Tuple<Long, ClusterState> termAndClusterState = loadTermAndClusterState(persistedClusterStateService, env);
        final ClusterState oldClusterState = termAndClusterState.v2();
        final Settings oldPersistentSettings = oldClusterState.metaData().persistentSettings();
        terminal.println(Terminal.Verbosity.VERBOSE, "persistent settings: " + oldPersistentSettings);
        final Settings.Builder newPersistentSettingsBuilder = Settings.builder().put(oldPersistentSettings);
        for (String settingToRemove : settingsToRemove) {
            boolean matched = false;
            for (String settingKey : oldPersistentSettings.keySet()) {
                if (Regex.simpleMatch(settingToRemove, settingKey)) {
                    newPersistentSettingsBuilder.remove(settingKey);
                    if (matched == false) {
                        terminal.println("The following settings will be removed:");
                    }
                    matched = true;
                    terminal.println(settingKey + ": " + oldPersistentSettings.get(settingKey));
                }
            }
            if (matched == false) {
                throw new UserException(ExitCodes.USAGE,
                    "No persistent cluster settings matching [" + settingToRemove + "] were found on this node");
            }
        }
        final ClusterState newClusterState = ClusterState.builder(oldClusterState)
            .metaData(MetaData.builder(oldClusterState.metaData()).persistentSettings(newPersistentSettingsBuilder.build()).build())
            .build();
        terminal.println(Terminal.Verbosity.VERBOSE,
            "[old cluster state = " + oldClusterState + ", new cluster state = " + newClusterState + "]");

        confirm(terminal, CONFIRMATION_MSG);

        try (PersistedClusterStateService.Writer writer = persistedClusterStateService.createWriter()) {
            writer.writeFullStateAndCommit(termAndClusterState.v1(), newClusterState);
        }

        terminal.println(SETTINGS_REMOVED_MSG);
    }
}
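The matching loop in `processNodePaths` can be tried without any Elasticsearch classes. The following is a rough sketch only: `RemoveSettingsSketch`, `remove`, and `prefixMatch` are hypothetical names, a plain `Map` stands in for `Settings`, `IllegalArgumentException` stands in for `UserException`, and the matcher is a simplified stand-in for `Regex.simpleMatch` that only understands a trailing `*`.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

public class RemoveSettingsSketch {

    // Returns a copy of 'persistent' without the keys matched by any pattern.
    // Each pattern is checked against the ORIGINAL keys, as in the command, so two
    // overlapping patterns both count as matched. A pattern that matches nothing
    // aborts the whole operation before anything is returned.
    static Map<String, String> remove(Map<String, String> persistent, List<String> patterns,
                                      BiPredicate<String, String> matcher) {
        Map<String, String> remaining = new LinkedHashMap<>(persistent);
        for (String pattern : patterns) {
            boolean matched = false;
            for (String key : persistent.keySet()) {
                if (matcher.test(pattern, key)) {
                    remaining.remove(key);
                    matched = true;
                }
            }
            if (matched == false) {
                throw new IllegalArgumentException(
                    "No persistent cluster settings matching [" + pattern + "] were found");
            }
        }
        return remaining;
    }

    // Hypothetical matcher: exact match, or prefix match when the pattern ends in '*'.
    static boolean prefixMatch(String pattern, String key) {
        return pattern.endsWith("*")
            ? key.startsWith(pattern.substring(0, pattern.length() - 1))
            : pattern.equals(key);
    }

    public static void main(String[] args) {
        Map<String, String> persistent = new LinkedHashMap<>();
        persistent.put("cluster.routing.allocation.disk.threshold_enabled", "false");
        persistent.put("xpack.monitoring.exporters.my_exporter.host", "10.1.2.3");
        // Only the disk threshold setting remains after the removal.
        System.out.println(remove(persistent, List.of("xpack.monitoring.*"), RemoveSettingsSketch::prefixMatch));
    }
}
```

Because a pattern with no match throws before the new state is built, nothing is written in that case, which is the behavior `testSettingDoesNotMatch` exercises.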
Original file line number Diff line number Diff line change
@@ -0,0 +1,135 @@
/*
* Licensed to Elasticsearch under one or more contributor
* license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright
* ownership. Elasticsearch licenses this file to you under
* the Apache License, Version 2.0 (the "License"); you may
* not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.elasticsearch.cluster.coordination;

import joptsimple.OptionSet;
import org.elasticsearch.ElasticsearchException;
import org.elasticsearch.cli.MockTerminal;
import org.elasticsearch.cli.UserException;
import org.elasticsearch.cluster.routing.allocation.DiskThresholdSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.env.Environment;
import org.elasticsearch.env.TestEnvironment;
import org.elasticsearch.test.ESIntegTestCase;

import static org.hamcrest.Matchers.contains;
import static org.hamcrest.Matchers.containsString;
import static org.hamcrest.Matchers.not;

@ESIntegTestCase.ClusterScope(scope = ESIntegTestCase.Scope.TEST, numDataNodes = 0, autoManageMasterNodes = false)
public class RemoveSettingsCommandIT extends ESIntegTestCase {

    public void testRemoveSettingsAbortedByUser() throws Exception {
        internalCluster().setBootstrapMasterNodeIndex(0);
        String node = internalCluster().startNode();
        client().admin().cluster().prepareUpdateSettings().setPersistentSettings(Settings.builder()
            .put(DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_DISK_THRESHOLD_ENABLED_SETTING.getKey(), false).build()).get();
        Settings dataPathSettings = internalCluster().dataPathSettings(node);
        ensureStableCluster(1);
        internalCluster().stopRandomDataNode();

        Environment environment = TestEnvironment.newEnvironment(
            Settings.builder().put(internalCluster().getDefaultSettings()).put(dataPathSettings).build());
        expectThrows(() -> removeSettings(environment, true,
            new String[]{ DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_DISK_THRESHOLD_ENABLED_SETTING.getKey() }),
            ElasticsearchNodeCommand.ABORTED_BY_USER_MSG);
    }

    public void testRemoveSettingsSuccessful() throws Exception {
        internalCluster().setBootstrapMasterNodeIndex(0);
        String node = internalCluster().startNode();
        client().admin().cluster().prepareUpdateSettings().setPersistentSettings(Settings.builder()
            .put(DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_DISK_THRESHOLD_ENABLED_SETTING.getKey(), false).build()).get();
        assertThat(client().admin().cluster().prepareState().get().getState().metaData().persistentSettings().keySet(),
            contains(DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_DISK_THRESHOLD_ENABLED_SETTING.getKey()));
        Settings dataPathSettings = internalCluster().dataPathSettings(node);
        ensureStableCluster(1);
        internalCluster().stopRandomDataNode();

        Environment environment = TestEnvironment.newEnvironment(
            Settings.builder().put(internalCluster().getDefaultSettings()).put(dataPathSettings).build());
        MockTerminal terminal = removeSettings(environment, false,
            randomBoolean() ?
                new String[]{ DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_DISK_THRESHOLD_ENABLED_SETTING.getKey() } :
                new String[]{ "cluster.routing.allocation.disk.*" }
        );
        assertThat(terminal.getOutput(), containsString(RemoveSettingsCommand.SETTINGS_REMOVED_MSG));
        assertThat(terminal.getOutput(), containsString("The following settings will be removed:"));
        assertThat(terminal.getOutput(), containsString(
            DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_DISK_THRESHOLD_ENABLED_SETTING.getKey() + ": " + false));

        internalCluster().startNode(dataPathSettings);
        assertThat(client().admin().cluster().prepareState().get().getState().metaData().persistentSettings().keySet(),
            not(contains(DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_DISK_THRESHOLD_ENABLED_SETTING.getKey())));
    }

    public void testSettingDoesNotMatch() throws Exception {
        internalCluster().setBootstrapMasterNodeIndex(0);
        String node = internalCluster().startNode();
        client().admin().cluster().prepareUpdateSettings().setPersistentSettings(Settings.builder()
            .put(DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_DISK_THRESHOLD_ENABLED_SETTING.getKey(), false).build()).get();
        assertThat(client().admin().cluster().prepareState().get().getState().metaData().persistentSettings().keySet(),
            contains(DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_DISK_THRESHOLD_ENABLED_SETTING.getKey()));
        Settings dataPathSettings = internalCluster().dataPathSettings(node);
        ensureStableCluster(1);
        internalCluster().stopRandomDataNode();

        Environment environment = TestEnvironment.newEnvironment(
            Settings.builder().put(internalCluster().getDefaultSettings()).put(dataPathSettings).build());
        UserException ex = expectThrows(UserException.class, () -> removeSettings(environment, false,
            new String[]{ "cluster.routing.allocation.disk.bla.*" }));
        assertThat(ex.getMessage(), containsString("No persistent cluster settings matching [cluster.routing.allocation.disk.bla.*] were " +
            "found on this node"));
    }

    private MockTerminal executeCommand(ElasticsearchNodeCommand command, Environment environment, boolean abort, String... args)
        throws Exception {
        final MockTerminal terminal = new MockTerminal();
        final OptionSet options = command.getParser().parse(args);
        final String input;

        if (abort) {
            input = randomValueOtherThanMany(c -> c.equalsIgnoreCase("y"), () -> randomAlphaOfLength(1));
        } else {
            input = randomBoolean() ? "y" : "Y";
        }

        terminal.addTextInput(input);

        try {
            command.execute(terminal, options, environment);
        } finally {
            assertThat(terminal.getOutput(), containsString(ElasticsearchNodeCommand.STOP_WARNING_MSG));
        }

        return terminal;
    }

    private MockTerminal removeSettings(Environment environment, boolean abort, String... args) throws Exception {
        final MockTerminal terminal = executeCommand(new RemoveSettingsCommand(), environment, abort, args);
        assertThat(terminal.getOutput(), containsString(RemoveSettingsCommand.CONFIRMATION_MSG));
        assertThat(terminal.getOutput(), containsString(RemoveSettingsCommand.SETTINGS_REMOVED_MSG));
        return terminal;
    }

    private void expectThrows(ThrowingRunnable runnable, String message) {
        ElasticsearchException ex = expectThrows(ElasticsearchException.class, runnable);
        assertThat(ex.getMessage(), containsString(message));
    }
}
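The `executeCommand` helper above feeds the prompt either `y`/`Y` or a random single letter that is neither. A plain-Java sketch of that input selection follows. It assumes, as the test's predicate suggests, that only a case-insensitive `y` confirms; `ConfirmSketch`, `confirmed`, and `randomAbortInput` are hypothetical names and not part of the test framework.

```java
import java.util.Random;

public class ConfirmSketch {

    // Assumed rule: only "y" or "Y" confirms; any other input aborts.
    static boolean confirmed(String input) {
        return input.equalsIgnoreCase("y");
    }

    // Picks a random single ASCII letter that does not confirm, similar in spirit to
    // randomValueOtherThanMany(c -> c.equalsIgnoreCase("y"), () -> randomAlphaOfLength(1)).
    static String randomAbortInput(Random random) {
        String candidate;
        do {
            char base = random.nextBoolean() ? 'A' : 'a';
            candidate = String.valueOf((char) (base + random.nextInt(26)));
        } while (confirmed(candidate)); // retry until the letter is not y/Y
        return candidate;
    }

    public static void main(String[] args) {
        Random random = new Random();
        System.out.println("abort input: " + randomAbortInput(random));
        System.out.println("y confirms: " + confirmed("y") + ", Y confirms: " + confirmed("Y"));
    }
}
```

Randomizing the rejected letter, rather than always sending `n`, checks that every non-`y` answer is treated as an abort.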