rfc: pgwire-compatible query cancellation #75870
Conversation
Force-pushed from 0a76f92 to 6df50d1.
Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @petermattis, and @rafiss)
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 16 at r1 (raw file):
## Summary

The Postgres query cancel protocol is hard to implement since it only uses 64
I think the summary needs to start with a sentence that says what the query cancel protocol is and why it's important.
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 19 at r1 (raw file):
bits of data as an identifier, and is sent over a separate (unauthenticated) connection, different from the SQL connection. For dedicated clusters, these 64 bits of data need to identify a node and query to cancel. We can use 32 bits to
node and session. The query is implicit (current query).
Ben says we should think about the query, but I disagree. This is explained here:
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 91 at r1 (raw file):
#### SQL Node Changes

The connExecutor will be updated to generate a random 32-bit integer (secretID)
Technically, we only need a secretID per connExec that's associated with a client SQL session, not for those used inside InternalExecutor.
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 97 at r1 (raw file):
BackendKeyData that is sent back to the client. The status server will have a new endpoint named PGWireCancelQuery, analogous to
I'd call it CancelQueryByKey
, no need to say "pg" in there.
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 108 at r1 (raw file):
cancellation requests. If a cancel request fails, which is likely to only happen if an attacker is spamming requests, it will be penalized by holding onto the semaphore for an extra second. If we set the concurrency of the semaphore to 256
You have not explained what this semaphore is yet at this point.
Also, as explained in the PR review already, I think the semaphore should be at the ingress, on the first sql server that receives the cancel request over pgwire, not in the target handler of the RPC; otherwise an attacker can flood the pod-to-pod network with bogus requests.
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 138 at r1 (raw file):
When a proxy sees a CancelRequest, it first extracts the IP address component of the message. If needed, it will forward the request to the proxy with that address using an RPC that is only for proxy-proxy communication. This RPC
This proxy-to-proxy communication is new. The RFC should highlight this, and contain a sub-section that explains how proxies will authenticate themselves to each other.
I skimmed this RFC and it matches up with my understanding of what was discussed elsewhere. I'll leave the detailed review to others.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @rafiss)
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rafiss)
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 94 at r1 (raw file):
when it is initialized, and register it with the server’s [SessionRegistry](https://github.com/cockroachdb/cockroach/blob/a434c8418c36dbeb64e73588bcd4dd5b248c3238/pkg/sql/conn_executor.go#L1692). The 32-bit SQLInstanceID and the random 32 bit secretID will be used as the
In a multi-tenant setting, SQLInstanceID should never be more than 15 bits. We need this to be true for unique_rowid to remain sound. We have an outstanding problem whereby we use the node ID as the instance ID indiscriminately. We potentially ought to change the dedicated servers to utilize the same leasing protocol used in multi-tenant settings for unique_rowid(), and potentially this as well. If we do so, we can have a bunch more bits for the secret, which will make the argument more compelling. I suggest we leave that as future work, because it's a lot to stake on this project right now, but it doesn't seem too far off.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz)
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 16 at r1 (raw file):
Previously, knz (kena) wrote…
I think the summary needs to start with a sentence that says what the query cancel protocol is and why it's important.
good point
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 19 at r1 (raw file):
Previously, knz (kena) wrote…
node and session. The query is implicit (current query).
Ben says we should think about the query, but I disagree. This is explained here:
nice catch
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 91 at r1 (raw file):
Previously, knz (kena) wrote…
Technically, we only need a secretID per connExec that's associated with a client SQL session, not for those used inside InternalExecutor.
true
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 94 at r1 (raw file):
Previously, ajwerner wrote…
In a multi-tenant setting, SQLInstanceID should never be more than 15 bits. We need this to be true for unique_rowid to remain sound. We have an outstanding problem whereby we use the node ID as the instance ID indiscriminately. We potentially ought to change the dedicated servers to utilize the same leasing protocol used in multi-tenant settings for unique_rowid(), and potentially this as well. If we do so, we can have a bunch more bits for the secret, which will make the argument more compelling. I suggest we leave that as future work, because it's a lot to stake on this project right now, but it doesn't seem too far off.
i think we can still use more bits in basically all cases, without needing to do that future work. Thanks to @knz for the idea back in his initial review of my PR from 6 months ago
I'm including something like this in the next revision of my original PR
```go
const (
	lower32BitsMask uint64 = 0x00000000FFFFFFFF
	lower52BitsMask uint64 = 0x000FFFFFFFFFFFFF
	leadingBitMask  uint64 = 1 << 63
)

func MakeBackendKeyData(rng *rand.Rand, sqlInstanceID base.SQLInstanceID) BackendKeyData {
	ret := rng.Uint64()
	if sqlInstanceID < 1<<11 {
		// Only keep the lower 52 bits.
		ret = ret & lower52BitsMask
		// Set bits 52-62 from the sqlInstanceID; the leading bit stays 0.
		ret = ret | (uint64(sqlInstanceID) << 52)
	} else {
		// Only keep the lower 32 bits.
		ret = ret & lower32BitsMask
		// Set the leading bit.
		ret = ret | leadingBitMask
		// Set the other upper 31 bits from the sqlInstanceID.
		ret = ret | (uint64(sqlInstanceID) << 32)
	}
	return BackendKeyData(ret)
}
```
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 97 at r1 (raw file):
Previously, knz (kena) wrote…
I'd call it CancelQueryByKey, no need to say "pg" in there.
done
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 108 at r1 (raw file):
Previously, knz (kena) wrote…
You have not explained what this semaphore is yet at this point.
Also, as explained in the PR review already, I think the semaphore should be at the ingress, on the first sql server that receives the cancel request over pgwire, not in the target handler of the RPC; otherwise an attacker can flood the pod-to-pod network with bogus requests.
definitely agree about adding the semaphore at the ingress. i'm already incorporating that in the next revision of my original PR, which i'll publish soon
will update RFC
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 138 at r1 (raw file):
Previously, knz (kena) wrote…
This proxy-to-proxy communication is new. The RFC should highlight this, and contain a sub-section that explains how proxies will authenticate themselves to each other.
will add a proxy-proxy communication highlight.
i'm not sure how proxies will authenticate themselves to each other. would it be sufficient to say that they don't need to if they are all in the same private subnet? (proxies aren't exposed to the internet, since they are behind a cloud load balancer)
Force-pushed from 6df50d1 to bcfaf5f.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz and @rafiss)
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 138 at r1 (raw file):
would it be sufficient to say that they don't need to if they are all in the same private subnet?
heh, no. Defense in depth, remember? If private subnets were sufficient, we wouldn't care about using TLS throughout the CC infrastructure. And yet we do.
Force-pushed from bcfaf5f to 3078823.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz)
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 138 at r1 (raw file):
Previously, knz (kena) wrote…
would it be sufficient to say that they don't need to if they are all in the same private subnet?
heh, no. Defense in depth, remember? If private subnets were sufficient, we wouldn't care about using TLS throughout the CC infrastructure. And yet we do.
got it, added more details now
things here make sense, must've been a lot of discussions :)
i think a few diagrams here would make it easier to digest.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz and @rafiss)
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 11 at r3 (raw file):
The proposal here is entirely dependent on many thoughtful discussions and prior work with Jordan Lewis, knz, Andrew Werner, Andy Kimball, Peter Mattis, Ben
haha that's not a full name :')
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 48 at r3 (raw file):
Nearly all Postgres drivers support the Postgres query cancellation protocol. For example, the PGJDBC [setQueryTimeout](https://jdbc.postgresql.org/documentation/publicapi/org/postgresql/jdbc/PgStatement.html#setQueryTimeout-int-)
nit: if you want to avoid these long lines you could use the markdown thing where you put links down at the bottom.
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 101 at r3 (raw file):
set to the SQLInstanceID. Note that SQLInstanceIDs are always positive, so it's safe to use the leading bit in this way. This BackendKeyData is sent back to the client.
it would be great if we could have a visualisation of how the 64 bit space is used (but i am a very visual person so up to you)
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 113 at r3 (raw file):
Since this endpoint is unauthenticated, before performing any business logic, it will use a semaphore to guard the number of concurrent cancellation requests. If
are we penalised if the query just succeeded before the CANCEL QUERY request?
not saying it's too important, but i can imagine a crazy-ish case where say there's a blocked query, we issue 3000 cancels to all existing queries, blocked query gets cancelled, everything else succeeds but we still gotta wait 3000/concurrency seconds. very hypothetical and not important, just curious.
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 138 at r3 (raw file):
to the server. When a proxy sees a BackendKeyData, it will generate a new random 32-bit secret
it would be great to get a sequence diagram of the protocol here
with nit about clarifying semaphore semantics.
I'll let oliver finish iterating with you.
Reviewed 1 of 1 files at r3, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @knz, @otan, and @rafiss)
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 113 at r3 (raw file):
Previously, otan (Oliver Tan) wrote…
are we penalised if the query just succeeded before the CANCEL QUERY request?
not saying it's too important, but i can imagine a crazy-ish case where say there's a blocked query, we issue 3000 cancels to all existing queries, blocked query gets cancelled, everything else succeeds but we still gotta wait 3000/concurrency seconds. very hypothetical and not important, just curious.
good point. I see in the code it's a TryAcquire, not Acquire, but that's worth explaining here.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @knz and @otan)
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 11 at r3 (raw file):
Previously, otan (Oliver Tan) wrote…
haha that's not a full name :')
:D
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 48 at r3 (raw file):
Previously, otan (Oliver Tan) wrote…
nit: if you want to avoid these long lines you could use the markdown thing where you put links down at the bottom.
eh i'm not a huge fan of that actually
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 101 at r3 (raw file):
Previously, otan (Oliver Tan) wrote…
it would be great if we could have a visualisation of how the 64 bit space is used (but i am a very visual person so up to you)
nice idea
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 113 at r3 (raw file):
Previously, knz (kena) wrote…
good point. I see in the code it's a TryAcquire, not Acquire, but that's worth explaining here.
adding more details
docs/RFCS/20220202_pgwire_compatible_cancel.md, line 138 at r3 (raw file):
Previously, otan (Oliver Tan) wrote…
it would be great to get a sequence diagram of the protocol here
will try to add something. let's see how my text diagram skills are
Release note: None
Force-pushed from 3078823 to bfb9b8b.
thanks for the reviews all. I've added a few diagrams as suggested. bors r=otan,knz
Build succeeded.
67501: sql/pgwire: implement pgwire query cancellation r=knz a=rafiss

closes #41335

See individual commits for more details. This adds a BackendKeyData struct which identifies a session and the SQLInstance running that session. Clients send it to cancel a request, and it's handled by a new CancelQueryByBackendKeyData endpoint in the status server. The cancellation gets forwarded to the correct node, and there is a per-node rate limit to prevent a DoS attack. Also, refer to the RFC: #75870

This PR only adds support for non-multitenant clusters. Refer to the RFC for additional work needed for serverless/multitenant support.

Co-authored-by: Rafi Shamim <[email protected]>
refs #41335
Release note: None