pgwire: implement cancellation protocol #34520
Closed
The original id is 128 bits. I'd like a comment here showing that using only 64 of the bits is secure. Presumably the 128 was chosen for a good reason and for some reason 64 wasn't good enough. So, why is it ok here?
Well, the cluster-wide ID needs to be unique across the entire cluster. This ID only needs to be unique on a single node, so I figured you could just use the timestamp part of the cluster-wide ID. Maybe I'm wrong though, thoughts?
This commit wouldn't support canceling queries on node A by sending the message to node B, but judging by the lib/pq implementation that's not needed or expected anyway.
Is it theoretically possible for two identical wall times to be used for two different connections on the same node? If so, one connection could incorrectly cancel another. I'm not sure exactly how the HLC timestamp is generated here, or whether this is possible, but it seems like it might be because of the logical part: if one node's clock gets too far ahead of (or behind?) the others in the cluster, its HLC will likely stop advancing its wall time and just increment its logical component, right? In that case two connections created during that period would have the same wall time and could hit this. If this can't happen, I'd like a comment explaining why.
I'm very science dog here, but after the absence of this PR annoyed me for the millionth time, I looked. So we need 64 bits but our IDs are 128 bits, and that's making us uneasy, right?
https://github.com/cockroachdb/vendored/blob/223d2a8a5a0cc5fe1268e7942b4db86798bed955/github.com/lib/pq/conn_go18.go#L134-L137
We have to keep the NodeID for load-balanced cancellation to work. NodeIDs are int32 but they're always positive, so we have 16+32 bits left. If we fill the 48 bits with true pseudo-randomness (i.e. don't pick part of the timestamp here), wouldn't that be good enough? We could even boost it up if we dedicate the first bit to deciding whether the NodeID is uvarint-encoded or not; NodeIDs are almost certainly nowhere close to MaxInt32, which gives us extra bits in the expected case. But honestly, that seems over the top. 48 bits of randomness should be plenty and at the same time cheap enough to generate.
I haven't looked but I'm guessing that we like that our ClusterWideIDs contain relevant data (?), so my suggestion would be to just tack these random 48bit on locally (they're not shared with other nodes).
To handle a cancel() request, the receiving node first checks if the NodeID matches, in which case it cancels locally. Otherwise it proxies the cancel() over to the node that is supposed to handle it.
Would love for this to get finished. We write this code way too often:
`query := "SET statement_timeout='5s'; SELECT"`
or worse, forget, and things just hang if they didn't work out the way they ought to.
Your math is off. The sign is just 1 bit. There's just 33 bits left, not 48.
The rest of the argument relative to randomness feels weird to me. Your argument stands trivially if we are guaranteed that all queries terminate nearly instantaneously. But what happens in case a query is long-running? The chance of conflict increases.
We can probably compute the probability based on a distribution of latencies and the frequency of new queries. Unfortunately I don't have the math at the top of my head. Do you remember how that works?
Yes that bit is correct (and necessary when load balancers are involved).
I just looked, and `CustomCancel` allows us to just call `conn.Cancel()`, which is exactly what I think we'd want to do throughout all CRDB-related uses of this. I don't have a strong case for implementing the cancel protocol with this option.
It looks like the `CustomCancel` config was removed in the latest version of pgx, so we should avoid introducing a dependency on it. This is discussed in jackc/pgx#679, which is also a generally interesting discussion of PgBouncer's handling of the cancel protocol: jackc/pgx#679 (comment). @andreimatei might be interested in that.
`CustomCancel` was removed in the latest version because that version now closes the connection upon context cancellation. That's great! It's even better, and we could use that to fix our test behavior.
Go isn't the only language that uses the cancel protocol though! Let's not make the mistake of blinding ourselves to other client behavior - even if Go's cancel works well for our purposes, other languages still use the cancel protocol.
See PGJDBC, the main Java driver: https://github.com/pgjdbc/pgjdbc/blob/14576f4bca3a2484fd4f81a0d8276ae5cab9a419/pgjdbc/src/main/java/org/postgresql/core/QueryExecutorBase.java#L164
Jordan my comment above is driven by context: the reason why Tobias (and then myself) got interested in this recently is because it's annoying our own test suite -- which is focused on Go, lib/pq and hopefully soon pgx.
For the wider picture, unfortunately, as established above, we are still in a bind and there's no good solution in sight. This conclusion is further substantiated by the discussion on pgx's own repository; see the PR linked by Nathan above.
As per my comment at #34520 (comment), as well as Matt's at #34520 (comment), I personally wouldn't spend time on this beyond what's needed to advance our own testing agenda. The design of postgres' cancel protocol is just very bad, and trying to do something here will create a lot of complexity on our side (with unclear gain).