Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add decommission capability and removenode cmd #18530

Merged
merged 8 commits into from
Feb 29, 2024

Conversation

lucyge2022
Copy link
Contributor

What changes are proposed in this pull request?

Add decommission capability for worker and removenode cli cmd to remove node from etcd registration

Why are the changes needed?

to add decommission capability for worker

Does this PR introduce any user facing changes?

New cli added for remove worker node

@lucyge2022
Copy link
Contributor Author

addresses #18495 (comment)
FYI @jja725

Optional<WorkerInfo> targetWorker = getAllMembers().getWorkerById(worker.getIdentity());
if (!targetWorker.isPresent()) {
throw new InvalidArgumentException(
String.format("Unrecognized or non-existing worker:%s", worker.getIdentity()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
String.format("Unrecognized or non-existing worker:%s", worker.getIdentity()));
String.format("Unrecognized or non-existing worker: %s", worker.getIdentity()));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// Worker should already be offline
if (targetWorker.get().getState() != WorkerState.LOST) {
throw new InvalidArgumentException(
String.format("Can't decommission running worker:%s, stop the worker"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
String.format("Can't decommission running worker:%s, stop the worker"
String.format("Can't decommission running worker: %s, stop the worker"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

* @return WorkerIdentity object
* @throws InvalidArgumentException
*/
public static WorkerIdentity fromString(String workerIdentityStr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this method close to toString so that hopefully people will remember to keep the two implementations compatible when they change either of them in the future. Also please add a unit test to ensure the string representation generated by toString can be correctly parsed by fromString.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done and done


@Override
public String getUsage() {
return "remove";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usage string looks like getCommandName() + " -n <worker_id> | -h".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for catching, done


@Override
public String getDescription() {
return "Remove given worker from the cluster";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return "Remove given worker from the cluster";
return "Remove a worker from the cluster, so that clients and other workers will not consider the removed worker for services. The worker must have been stopped before it can be safely removed from the cluster.";

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

membershipManager.decommission(
new WorkerInfo().setIdentity(WorkerIdentity.fromString(workerId)));
mPrintStream.println(String.format(
"Successfully decommissioned worker:%s", workerId));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Successfully decommissioned worker:%s", workerId));
"Successfully decommissioned worker: %s", workerId));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 132 to 136
String prefix = "worker-";
String errStr = "Unrecognized worker identity string.";
if (!workerIdentityStr.startsWith(prefix)) {
throw new InvalidArgumentException(errStr);
}
Copy link
Contributor

@dbw9580 dbw9580 Feb 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think missing the worker- prefix warrants an error, i.e. both worker-1 and 1 should be considered valid inputs, and the parser should be tolerant of the optional prefix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i actually wanted to make sure tostring and fromstring are strictly paired, i've also added the warning comment should someone wants to change it.

try {
return ParserV0.INSTANCE.fromLong(Long.parseLong(idStr));
} catch (NumberFormatException ex) {
throw new InvalidArgumentException(errStr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Echo user input:

Suggested change
throw new InvalidArgumentException(errStr);
throw new InvalidArgumentException(errStr + ": " + workerIdentityStr);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -0,0 +1,97 @@
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also make changes to nodes.go?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes Rico is making the change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added now with the running format:
bin/alluxio process remove-worker -n <worker_id>

membershipManager.decommission(
new WorkerInfo().setIdentity(WorkerIdentity.fromString(workerId)));
mPrintStream.println(String.format(
"Successfully decommissioned worker: %s", workerId));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Successfully decommissioned worker: %s", workerId));
"Successfully removed worker from membership: %s", workerId));

i'd avoid using the word "decommission" from the user facing side as much as possible to avoid confusion

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

@dbw9580 dbw9580 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@Xenorith Xenorith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will you add the golang cli code as part of this PR?

@lucyge2022
Copy link
Contributor Author

will you add the golang cli code as part of this PR?

just realized i haven't pushed yesterday, i just pushed the changes, can u take a quick look ?

@@ -0,0 +1,46 @@
package process
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing license header in new file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@lucyge2022 lucyge2022 added the type-feature This issue is a feature request label Feb 29, 2024
@lucyge2022
Copy link
Contributor Author

alluxio-bot, merge this please.

@alluxio-bot alluxio-bot merged commit 1bdd219 into Alluxio:main Feb 29, 2024
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-feature This issue is a feature request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants