Adds "try" and "monitor-retry" options to consul lock
command
#1567
Conversation
Previously, it would try once "up to" the timeout, but in practice it would just fall through. This modifies the behavior to block until the timeout has been reached.
Tweaked things in that last commit so it always waits for the full timeout period while trying to get the lock, versus trying once "up to" the timeout, which usually just falls through right away. This should better match what users expect.
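The intended behavior can be illustrated with a minimal sketch (this is not the actual Consul code; tryLock, errLockHeld, and the retry interval are hypothetical stand-ins): keep retrying acquisition until the deadline passes, instead of making a single attempt that is merely bounded by the timeout.

package main

import (
	"errors"
	"fmt"
	"time"
)

// errLockHeld is a hypothetical sentinel meaning someone else holds the lock.
var errLockHeld = errors.New("lock is held by another session")

// tryLock is a stand-in for a single acquisition attempt against the KV store.
func tryLock() error {
	// Pretend the lock is always contended for this example.
	return errLockHeld
}

// acquireWithDeadline retries tryLock until it succeeds or the deadline passes,
// mirroring the "block until the timeout has been reached" behavior described
// above, rather than a single attempt that falls through right away.
func acquireWithDeadline(timeout, retryInterval time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		err := tryLock()
		if err == nil {
			return nil // acquired
		}
		if !errors.Is(err, errLockHeld) {
			return err // a real failure, give up immediately
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("timed out after %s waiting for lock", timeout)
		}
		time.Sleep(retryInterval)
	}
}

func main() {
	if err := acquireWithDeadline(2*time.Second, 250*time.Millisecond); err != nil {
		fmt.Println("acquire failed:", err)
		return
	}
	fmt.Println("lock acquired")
}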
pairs, meta, err := kv.List(s.opts.Prefix, opts)
if err != nil {
	// TODO (slackpad) - Make a real error type here instead of using
	// a string check.
	const serverError = "Unexpected response code: 500"
Maybe at least move this out of the retry loop? Maybe we could use something like an IsServerError(error) helper, since I think we do this in a bunch of other places too, IIRC.
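A helper along the lines suggested here might look like the following sketch; the function name and the error text it matches are taken from the comment and the snippet above, not from an existing Consul API, so treat them as assumptions.

package main

import (
	"errors"
	"fmt"
	"strings"
)

// serverError is the substring checked in the snippet above; it is repeated
// here only for illustration.
const serverError = "Unexpected response code: 500"

// isServerError reports whether err looks like a 500 response from the
// Consul servers, using the same string check discussed in the review.
// This is a hypothetical helper sketching the IsServerError(error) idea.
func isServerError(err error) bool {
	return err != nil && strings.Contains(err.Error(), serverError)
}

func main() {
	err := errors.New("Unexpected response code: 500 (example failure)")
	fmt.Println(isServerError(err)) // true
	fmt.Println(isServerError(nil)) // false
}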
Haha, I was pastin'. Good call.
Looks like you already addressed the minor comments. This LGTM otherwise 🚢
Adds "try" and "monitor-retry" options to `consul lock` command
Question: Is it possible to use the new monitor-retry with a (slightly) larger timeout to facilitate restarting Consul servers without damaging the quorum? I'm not sure whether there would be unintended side effects at first glance, but it seems that if you gave a large enough value to [...]. Also, based on the naming of [...]. Thanks!
Hi @IsaacG, we didn't design this feature with this use case in mind, but it seems like it could be possible to use it as part of a solution for this. One problem I can see, though, is that being able to take the lock doesn't necessarily mean that it's safe to take a server down, so doing this in a safe way would require more logic. For example, if one server out of three died during the upgrade and lost the lock, a second one would be able to get it and restart itself, putting the cluster into an outage condition. You'd still want to confirm things like [...].

For your second question, I named the [...]
@slackpad thanks for the comments! In some very initial toying around, I already discovered that the 1 second [...]. Also, as you pointed out, there are a lot of rough edges that I'm admittedly completely ignoring for now; I just wanted to start with something as a PoC.
The "try" option should close #780 and the "monitor-retry" option should close #1162.